



TRADITIONAL EXAMINATIONS 
AND NEW-TYPE TESTS 




The Century

TRADITIONAL
EXAMINATIONS AND
NEW-TYPE TESTS


BY 

C. W. ODELL, Ph.D.

ASSISTANT DIRECTOR,

BUREAU OF EDUCATIONAL RESEARCH,
UNIVERSITY OF ILLINOIS





THE CENTURY CO. 
New York & London





Copyright, 1928, by
The Century Co.

Printed in U. S. A.



PREFACE 


In the preparation of this book the needs of teachers 
actually in service have been foremost in the writer’s 
mind, though it has been his intention also to make the 
book serviceable to prospective teachers and others 
interested in the field covered. Its purpose is to present 
both the traditional and the new-type examination in 
rather detailed fashion, to point out the merits and the 
limitations of each, and to suggest how each may be so 
constructed and used as to yield the greatest returns. 
It would perhaps have been desirable to include standardized
tests as well as examinations made by the
teacher, but anything approaching a complete treat- 
ment of both is hardly practicable within the limits of 
a single volume. It seemed more practicable because 
of the much smaller amount of space required, and also 
more essential, to include a discussion of school marks. 
Accordingly two chapters have been devoted to this 
topic. The remainder of the book deals rather directly 
with the two general types of examinations already re- 
ferred to, the traditional and the new-type. 

In view of the very considerable amount of attention 
devoted to testing in the educational literature of the 
past few years, it is possible that readers will ask what 
justification there is for another treatment of the sub- 
ject. Standardized tests have received the lion’s share 
of this recent attention and the new-type tests most of 
the rest. The treatment of the traditional examination 
has been mostly incidental and has consisted largely
of destructive rather than constructive criticism. Moreover,
most of the mass of material dealing with new-type
tests is in the form of periodical articles, pamphlets,
monographs, single chapters in books, and so
forth and is, therefore, decidedly fragmentary. The
chief merit claimed for this book is that it is based
upon a comprehensive review of the material just re- 
ferred to, and therefore, it is hoped, has gathered to- 
gether the best of this material into a single unified 
treatment, rather than that it is to any considerable 
extent an original contribution. 


C. W. Odell 



EDITOR’S INTRODUCTION 


Almost all who are engaged in educational work, 
either as students or teachers, are vitally interested in 
the subject of marks and examinations. While a not 
inconsiderable number of instructors are prone to say 
that they have no confidence in the results of tests of 
any kind and wish that they could be eliminated, there 
is doubtless little probability that there will be any 
radical change in the means by which recognition is 
given for the successful completion of work in educa- 
tion. It is also fully as probable that for a long time 
to come instructors will feel compelled, in order to 
assure themselves from time to time of the character 
of the work which is being given in their courses, to 
offer some kind of so-called test designed to furnish 
this information. 

This situation makes the publication of a book on 
marks and the new examinations especially timely. For 
a number of years a large amount of experimentation
has been in progress looking to the formulation of new
types of examinations which may be assumed to be
more objective in character than the traditional examinations
and which may lessen, if not eliminate, some
of the evils which are incident to the use of the older
type of test. A large number of educational institutions,
ranging from the elementary schools to the professional
schools, are making a more or less extensive
use of the various types of the so-called new examinations.
The author of this volume has made an unusually
thorough study of the whole subject and presents a
definitely organized statement and explanation of all 
of the more commonly known types of new examina- 
tions and many of the modified types which have not 
received as wide publicity and have not been used to 
nearly as great an extent. 

The subject is of such vital importance for all 
classes of school work that it seems highly desirable 
that a popular presentation should be available to all 
types of educators. While there is very little reason 
to believe that the educational world is going to abandon
the traditional examinations, because there are
certain educational values obvious to all in their use, 
there is little doubt that many of the newer types of 
examinations as illustrated in this volume will be used
with increasing frequency. All those engaged in educa- 
tion should be well informed concerning these variants, 
their weaknesses and their advantages. 

In the belief that this volume will prove of real value 
in extending widely the intelligent use of objective 
tests and the more critical and desirable selection of 
questions for discussion along traditional lines, this 
volume is submitted.


C. E. Chadsey



CONTENTS

CHAPTER

I THE PAST AND PRESENT STATUS OF EXAMINATIONS
    1 Introduction. A brief history of examinations and of the criticisms thereof
    2 Adverse criticisms of examinations
    3 The defense of examinations against the criticisms just given
    4 Advantages of standardized tests over examinations prepared by the teacher
    5 Advantages of examinations prepared by the teacher over standardized tests
    6 Summary

II WHAT ARE GOOD EXAMINATIONS?
    1 The purposes examinations should serve
    2 The qualities of good examinations
    3 Summary

III HOW TO MAKE AND GIVE EXAMINATIONS
    1 The preparation of good examinations
    2 The administration of examinations
    3 Summary

IV SCORING PUPILS' RESPONSES
    1 The weighting of exercises
    2 Scoring examination and test papers
    3 Changing scores into marks
    4 Summary

V THE MARKING SYSTEM AND ITS MEANING
    1 Should marks be used at all?
    2 Upon what should marks be based?
    3 What marks should be employed?
    4 Summary

VI THE DISTRIBUTION OF MARKS
    1 Should marks follow the normal or any other fixed frequency distribution?
    2 Suggested practices concerning the use of a standard distribution of marks
    3 Assigning marks to pupils in selected or non-average groups
    4 Adjusting the marks of teachers who do not conform to the standard
    5 Summary

VII MERITS AND LIMITATIONS OF TRADITIONAL AND NEW-TYPE EXAMINATIONS
    1 Merits and advantages of traditional, and limitations and disadvantages of new-type, examinations
    2 Merits and advantages of new-type, and limitations and disadvantages of traditional, examinations
    3 Summary

VIII EXAMPLES OF TRADITIONAL OR ESSAY EXAMINATIONS
    1 Types of mental activity to be tested
    2 Examples of essay questions calling for the twenty types of mental activity named above
    3 Examples of good discussion questions in literature
    4 A completion essay examination
    5 Traditional examination questions selected from those actually used by public-school teachers
    6 Traditional examination exercises based upon quotations
    7 Summary

IX THE CONSTRUCTION AND USE OF NEW-TYPE EXAMINATIONS
    1 Constructing new-type examinations
    2 Scoring new-type examinations and handling the results
    3 The selection of the most appropriate types for class-room use
    4 Summary

X SINGLE-ANSWER OR RECALL TESTS
    1 General discussion
    2 Ordinary single-answer tests
    3 Single-answer tests each containing only one exercise
    4 So-called “association” single-answer tests
    5 Definition or description single-answer tests
    6 Single-example tests
    7 Plural or multiple-example tests
    8 Compound single-answer tests
    9 Summary

XI MULTIPLE-ANSWER TESTS
    1 General discussion
    2 Ordinary multiple-answer tests
    3 Plural multiple-answer tests
    4 Compound multiple-answer tests
    5 Multiple-reason tests
    6 Multiple-description tests
    7 Summary

XII ALTERNATIVE TESTS
    1 General discussion
    2 True-false exercises
    3 Yes-no questions
    4 Other varieties of tests having only two possible answers
    5 Alternative tests which provide a third possible answer
    6 Summary

XIII COMPLETION TESTS
    1 General discussion
    2 Simple completion tests
    3 Completion tests with suggested answers
    4 Other varieties of completion tests
    5 Summary

XIV MATCHING TESTS
    1 General discussion
    2 Ordinary matching tests
    3 Compound matching tests
    4 Summary

XV INCORRECT-STATEMENTS TESTS
    1 General discussion
    2 Examples of incorrect-statements tests
    3 Summary

XVI MISCELLANEOUS TYPES OF THE NEW EXAMINATION
    1 General discussion
    2 Identification tests
    3 Distinguishing tests
    4 Continuity or rearrangement tests
    5 Verification or judgment tests
    6 Analogies tests
    7 Summary

XVII OBJECTIVE TESTS IN INSTITUTIONS OF HIGHER LEARNING
    1 General discussion
    2 Objective tests in liberal arts courses
    3 Objective tests in professional courses
    4 Summary

BIBLIOGRAPHY

INDEX




TABLES

TABLE

I STANDARDS FOR RATING PUPILS, BEECHVIEW-BEECHWOOD PUBLIC SCHOOLS

II SCALE OF QUALITIES OF WORK, PUBLIC SCHOOLS, GENESEO, ILLINOIS

III SUGGESTED PERCENTILE DISTRIBUTIONS OF MARKS FOR A FIVE-SYMBOL MARKING SYSTEM

IV SUGGESTED PERCENTILE DISTRIBUTIONS OF MARKS FOR OTHER THAN FIVE-SYMBOL MARKING SYSTEMS




FIGURES

FIGURE

1 GRAPHIC REPRESENTATION OF SAMPLING PUPILS' ACHIEVEMENTS BY A FEW QUESTIONS

2 GRAPHIC REPRESENTATION OF SAMPLING PUPILS' ACHIEVEMENTS BY A LARGE NUMBER OF QUESTIONS








TRADITIONAL EXAMINATIONS 
AND NEW-TYPE TESTS 


CHAPTER I 

THE PAST AND PRESENT STATUS OF EXAMINATIONS

1. Introduction. A brief history of examinations and
of the criticisms thereof. Examinations are not new.
They are in no sense a modern fad or a newfangled
procedure recently introduced into school practice. On
the contrary, they have been employed as a regular
and integral part of school work for hundreds, even
thousands, of years. A merely casual study of the educational
practices of the past is sufficient to yield the
information that the use of examinations of some sort
or other has been the rule rather than the exception.
In most countries, and in most of the schools in these
countries, tests¹ of achievement have had their place.
Probably the best known, and certainly one of the most
elaborate, of the ancient systems of examinations was
that maintained by the Chinese; but the Greeks, the
Romans, and many other peoples employed them as
well. In fact, definite tests of what has been learned
have had their place in practically every scheme of education
of which we have any record. Many of them
were not written or even verbal, being rather tests of
ability to apply and to do, but none the less they were
examinations designed to test mastery of what was
supposed to have been learned. When we come to medieval
education, we find numerous accounts not only of
public disputations and other testing procedures different
from those commonly employed at present, but also
of others very similar to the examinations with which
all readers are undoubtedly familiar. In some cases
these exercises, whatever their nature, were taken seriously
by both teachers and pupils, whereas in others
they were mere formalities, but the name and practice
were general. Not only were they employed in connection
with school work, but also, notably in China, they
found a place outside the school-room for purposes
which are now mostly included under the term civil
service. At present, examinations are probably more
prevalent than ever before, being used by practically
all teachers in rating pupils, by government departments
in choosing officials, by numerous business concerns
in selecting and promoting employees, and by
various other agencies.

¹ For the sake of variety, the word test will be used synonymously and
interchangeably with examination.

Not only are examinations an institution of long
standing, but also adverse as well as constructive criticism
of them is nothing new. To go no further back
than the eighteenth century, one finds an Oxford student
of 1766 quoted to the effect (2, p. 16)² that his
examination for the degree was an absolute farce, consisting
of one very easy question in Hebrew and one in
history. In this country Horace Mann, writing in 1845
(10, p. 37), stated certain points of superiority of what
he called the “new” method of examining. Presumably,
therefore, he had a more or less unfavorable opinion of
the examination methods in general use in his time.
Mann especially praised the use of uniform written
questions instead of oral ones, which were usually different
for different pupils. Apparently, however, no
such volume of criticism of both the function and form
of examinations as has appeared within the last few
years ever arose during any previous period, and
seemingly few persons at any one time in the past believed
that anything serious was wrong with current
practices regarding examinations or that any major
change was needed. Therefore the form of examinations
and the methods of using them underwent comparatively
little change during a long period and only
recently have modifications of any importance occurred
or even been strongly urged. Although the last
half-century or less has witnessed unusual advances in
providing better buildings and material equipment,
constructing more adequate curricula, and improving
the technique of instruction, supervision, and administration,
examinations were little affected during most
of this period, but continued to exist in essentially unchanged
form and to be employed in practically unchanged
circumstances. It is only within the last few
years that they have been subjected to any considerable
amount of critical evaluation and discussion and
that the twofold question as to their proper function
and best form has been raised.

² The numbers in parentheses refer to references listed in the Bibliography
beginning on p. 439.

Undoubtedly the chief cause contributing to raise
the question just mentioned was the publication of the
results from a number of investigations which showed,
or appeared to show, great unreliability³ and variability
of the marks given examination papers by
teachers. Prominent in making the studies referred to
were Johnson (40), Starch and Elliott (81, 82, 83),
Kelly (41), and Dearborn (17). Their work, and also
that of others, is too well known and has produced too
similar results to justify detailed accounts of the
various studies here. A single striking example, which
is also probably the best-known one, will suffice. Starch
and Elliott (82) submitted an examination paper in
geometry to a number of teachers of mathematics, and
were able to secure the marks given the paper by 114
teachers, each marking independently. The accompanying
distribution of marks is reported.

    Mark      Number
    90-94        2
    85-89        7
    80-84       11
    75-79       26
    70-74       13
    65-69       18
    60-64       17
    55-59        8
    50-54        5
    25-49        7

The actual marks ranged from 28 to 92 per cent,
75 being the passing mark. One or two examples even
more extreme than this have come to the writer’s attention,
and many others showing nearly as great variability
have been mentioned in educational literature.
Altogether a large mass of data has been accumulated,
published, and generally accepted as valid, which offers
strong evidence that if pupils’ answers to examination
questions of the type almost exclusively used until very
recently are marked by the same person a second time
after he has forgotten his first marks, or by two or
more individuals working independently, very great
differences in the marks assigned will frequently result.

³ Reliability may be briefly defined as accuracy in measuring whatever is
measured. For a fuller discussion, see p. 41.

It is only fair to say, however, that although the
men named above and others have presented the many
data referred to, their conclusions have not been accepted
by all persons interested as representative of
ordinary conditions in the schools. In general, those
who do not admit the validity of the conclusions drawn 
maintain that the marks collected and used were not 
assigned under ordinary school circumstances, but 
rather under conditions which tended to produce much 
greater variability than is the case in general school 
practice. Not only has the evidence for great varia- 
bility been criticized, but also some data have been 
presented which indicate a decidedly high degree of 
reliability or agreement among teachers. Bolton (5),
for example, reports on the scoring of twenty-four
arithmetic papers by twenty-two Seattle teachers. The 
papers were scored by the ordinary percentile system 
and the variations from the average mark for each pa- 
per computed. The average variation was approxi- 
mately 5 per cent. About one-sixth of the variations 
were not greater than 1 per cent, and one-third more, 
or one-half in all, not greater than 3 per cent. Further 
data presented by the same writer show that in the 
marking of single arithmetic questions almost a third 
of the teachers agreed with the average rating exactly, 
and that more than another third differed by not over 
1 per cent. Furthermore, Bolton analyzes some of the 
data which Starch used as evidence of great variabil- 
ity, and as a result claims that they actually show 
decided uniformity in marking. 
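
The two kinds of summary discussed above may be illustrated by a short
sketch in Python. It is offered only as an illustration and not as the
procedure followed by any of the investigators named. The first part
tallies the distribution reported by Starch and Elliott to show how many
of the 114 teachers gave the geometry paper a passing mark. The second
part computes an average variation of the kind Bolton describes, taken
here to mean the average of the absolute differences between each mark
and the average mark for the paper. The function names and the five
marks used in the second part are assumptions made for the example; they
are not Bolton's data.

```python
from statistics import mean

# Distribution reported by Starch and Elliott: band of marks -> number of teachers.
starch_elliott = {
    "90-94": 2, "85-89": 7, "80-84": 11, "75-79": 26, "70-74": 13,
    "65-69": 18, "60-64": 17, "55-59": 8, "50-54": 5, "25-49": 7,
}
PASSING_MARK = 75  # passing mark stated in the text


def passing_counts(distribution, passing_mark):
    """Return (teachers who passed the paper, teachers who failed it)."""
    passed = sum(
        count
        for band, count in distribution.items()
        if int(band.split("-")[0]) >= passing_mark  # lower limit of the band
    )
    return passed, sum(distribution.values()) - passed


def average_variation(marks):
    """Average absolute difference between each mark and the average mark."""
    centre = mean(marks)
    return mean(abs(mark - centre) for mark in marks)


total = sum(starch_elliott.values())
passed, failed = passing_counts(starch_elliott, PASSING_MARK)
print(f"{total} teachers: {passed} passed the paper, {failed} failed it")

# Hypothetical marks given to one paper by five teachers (not Bolton's data).
hypothetical_marks = [70, 75, 80, 85, 90]
print(f"average variation: {average_variation(hypothetical_marks):.1f} per cent")
```

On the reported distribution, 46 of the 114 teachers passed the paper and
68 failed it, which is the disagreement the critics emphasized. The
hypothetical marks yield an average variation of 6 per cent, a figure of
the same kind as the approximately 5 per cent reported by Bolton.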

Even though it is admitted that the evidence which 
shows great unreliability and variability among 
teachers' marks is not conclusive, the fact remains 
that it has been generally enough accepted as valid 
to create a wide-spread doubt as to the reliability of 
marks as commonly given. The question of how to im- 
prove the situation by constructing examinations, 
training teachers, or both, so that much closer if not
absolute agreement in the marks given will result, has
been raised time and again by college instructors in 
education, administrators, supervisors, and teachers, 
and has received a great deal of attention in print, 
in public addresses, in teachers’ meetings, and in 
courses in education and psychology. 

At first there was a strong tendency to answer the 
question by asserting that examinations prepared by 
the regular teacher should be replaced by standardized
tests.⁴ Now, however, practically all those who
have given careful thought to the problem have come
to realize that these two forms of measuring instruments
are not mutually exclusive or opposed, but that
each has its proper place in class-room procedure and
that a complete program of testing for a term or 
semester should almost, if not quite, always include 
the use of both. Therefore, instead of the merely nega- 
tive criticisms which first appeared, among the chief 
of which was the proposal to abolish examinations con- 
structed by the class-room teacher and to substitute 
therefor standardized tests, much constructive criti- 
cism has been offered and many suggestions made as 
to how to improve the form of examinations and to 
make better use of the results. 

⁴ A standardized test in the most limited sense is any test which has been given
to a large enough number of pupils of a given age, grade, or other homogeneous
group so that the results are fairly adequate indications of what achievements
are actually being attained by such pupils in general. Usually, however, the term
standardized is employed to refer to a test that has also been prepared with
especial care, and is fairly objective, reliable, and valid, and therefore should not
be applied to an ordinary test or examination, even though it has been given to
many pupils and the results collected and tabulated. There appears to be a growing
tendency to shorten the word standardized to standard when used in this
connection, but this usage has not yet become common.

From time to time a few persons occupying more or
less prominent positions in the field of education have
advocated the complete abolition of examinations and
a few schools and school systems have taken such action.
The movement to do so does not appear to be
gaining headway and perhaps is even less likely to
become general than it was a few years ago, before
the recent constructive work in this field. Apparently
most teachers and others concerned are firmly of the
opinion that examinations are, and probably always
will be, an integral part of every well-balanced system
of education, an opinion in which the writer emphatically
believes that they are justified. Later in the
discussion certain purposes and functions of examinations
will be stated and discussed, not so much, however,
from the standpoint of justifying the use of examinations
as from that of indicating how the most
and greatest values can be obtained from their use.

2. Adverse criticisms of examinations. As was stated 
above, there has been a considerable amount of ad- 
verse criticism of examinations recently, within the 
past two decades. Not only have a few writers and 
speakers urged that they be no longer employed, but 
others have advocated radical modifications of their 
form and use. In support of these views a number of 
arguments have been advanced and it is the purpose 
of the next few pages to state and to answer those 
most frequently urged. By so doing it is by no means 
intended to imply that none of the objections made 
possess any validity. On the contrary, most of them 
do point out real weaknesses and abuses in examina- 
tions as frequently, or even usually, employed. The 
writer believes, however, that a careful analysis will 
show that most, or even all, of the unfavorable criticisms
are really criticisms which concern the misuse
of examinations and are not valid against them as
an institution. In other words, the forms and methods
of using examinations can be so improved that they
are not, to any considerable degree, subject to the ad- 
verse criticisms given below. 

The following list includes all the unfavorable criti- 
cisms of examinations which have been heard or seen 
at all frequently. Other minor ones have been made 
by some one individual or perhaps by several, but are 
of such slight importance as not to be worth including 
in this list. It should be mentioned also that the 
criticisms given below are those made against exami- 
nations in general, or at least against all written ex- 
aminations, and not those which are directed only at 
certain types or varieties of examinations. 

I. Examinations are injurious to the health of those tak- 
ing them, causing overstrain, nervousness, worry, and other 
undesirable physical and mental results. 

II. The content covered by examination questions does 
not agree with the recognized objectives of education, but
instead encourages cramming, mere factual memorizing and
acquiring items of information rather than careful and continuous
study, reasoning, and other higher thought processes.

III. Examinations too often become objectives in them- 
selves, the pupils believing that the chief purpose of study 
is to pass examinations rather than to master the subject
or gain mental power. This objection is more or less similar
to the one stated immediately above, but still is probably
different enough to warrant separate statement and consideration.
At least it has been so considered by unfavorable
critics.

IV. Examinations encourage bluffing and cheating. This
occurs both because of the premium which they place on doing
so successfully, and because of the frequently prevailing conditions
which make bluffing relatively easy and cheating comparatively
safe.

V. Examinations develop habits of careless use of English 
and poor handwriting. This results because they emphasize 



THE PAST AND PRESENT STATUS OF EXAMINATIONS II 

writing a large amount as rapidly as possible and thus lead 
to the neglect of good form. 

VI. The time devoted to examinations can be more profit- 
ably used otherwise, for more study, recitation, review, and 
so forth. 

VII. The results of instruction in the field of education 
are intangible and cannot be measured as can production 
in industry or agriculture, physical growth, heat, light, and 
many other products of human or other activity. 

VIII. Examinations are unnecessary. Capable instructors
handling classes which are not too large are able to rate the 
work of their pupils without employing examinations. 

3. The defense of examinations against the criticisms
just given. It has already been suggested that the general
reply to all these objections is that they can be
avoided if examinations are properly constructed and
administered and the results properly employed. Moreover,
most of them appear to assume that the [...]
purpose of examinations is that of [...], an
assumption which will later be shown [...]. Moreover,
the defense of examinations will not be limited
to those brief general statements, but each
will be discussed and answered separately, though not
at great length. Most important of all, suggestions will
be given as to how to improve examination methods so
as to avoid the defects pointed out.

I. It is undeniably true that examinations have in
some instances resulted in injury to the physical or
mental well-being of those taking them. The same
is true of study and of all other school work as well as
of all work outside the school, but no one believes in
the total abolition of study or work because
of this fact. If examinations are given under hygienic
conditions, in rooms of the right temperature, with
adequate ventilation, with sufficient light coming from
the right direction, if the pupils have satisfactory seats 
and desks, if the length of the examination periods is 
not too great, if the children are not wrought up emo- 
tionally through excessive fear of failure and through 
the use of the examination as a club, if the same pre- 
cautions are observed which should be followed in the 
case of any activity, there is no reason why undesir- 
able results upon health should ensue. It is true that 
examinations are usually periods of work under 
greater pressure than usual, but similar experiences 
must be faced in practically every vocational and other 
activity of life, so that examinations are not to be con- 
demned on this account unless all activities in which 
there are times of unusual stress are likewise con- 
demned. Indeed, on the other hand, examinations will 
be making a real contribution to education if they as- 
sist in preparing individuals to meet such periods of 
stress successfully. 

II. The fact that examinations have frequently not 
been in agreement with the recognized objectives of 
education has resulted from the ignorance or carelessness
of those who made them. It is undoubtedly much
easier to prepare questions which test mere stock of
information and knowledge of facts than to construct
them so that they measure reasoning ability, power
to apply and adapt, and mental growth. It is, however,
entirely possible to make examinations which do
measure the latter qualities. Also, it requires considerably
less time and effort to construct an examination
over the first topics which come to mind than over the 
most important ones, or those which for some other 
reason should be included. Therefore, the remedy is 
not to do away with examinations, but rather to define
objectives more clearly and to give teachers better
training in making examinations which conform to 
these objectives and are not of such a nature that 
pupils who have not attained the desired goals to a 
fairly high degree will be able to make high marks 
thereon. 

Concerning the charge that examinations cause 
cramming there are at least two replies to be made. In 
the first place, effective cramming must be almost en- 
tirely upon facts. Therefore if the measurement of 
mere factual knowledge is reduced to its proper place 
as a minor function of most examinations, cramming 
will be automatically reduced until it becomes of little 
importance. Secondly, a certain amount of what may 
be called cramming is not altogether undesirable. It 
is a principle of learning that some time should be 
spent in reviewing what has already been studied. 
Therefore if examinations serve to stimulate more or 
less intensive and extensive review they are contribut- 
ing to a desired end. Certainly it is not generally de- 
sirable that pupils should cram upon materials which 
they should have been studying throughout a course, 
but have not, in the hope that by so doing they will be 
able to pass the final examination and thus receive 
credit. However, one may argue with some justification 
that it is better for pupils to do a certain amount of 
intensive cramming than not to study at all, that some 
of those who cram would not have studied during the 
term or semester even had they known that there would 
be no opportunity for cramming, and therefore that 
examinations should be given credit for causing at 
least some mental activity to be put forth. Further- 
more, from an entirely different standpoint, the ability 
to cram, to study intensively for a short period and
hold in mind what has been studied, is a valuable one.
The lawyer preparing a case, the physician in similar 
circumstances, and many others in their regular oc- 
cupations have need for just this ability. 

III. It is undeniably true that many teachers have 
had as their chief aim the mere preparation of pupils 
to pass examinations, especially those for eighth- 
grade graduation, college entrance, or some other crit- 
ical occasion. Since whether or not examinations be- 
come the chief objectives toward which pupils strive 
depends almost entirely upon how much emphasis is 
put upon them, such results have been practically in- 
evitable all too often. If pupils are given to understand 
that passing or failing is almost entirely determined 
by their showing upon final or other examinations, if
they are constantly reminded that they must study
this or that piece of work because they may be examined
upon it, if examinations are held up as of outstanding
importance and if failure to pass them is
considered a serious calamity and disgrace, it is inevitable
that the attitudes and results which ensue will
not be the best. On the other hand, if pupils are led to
regard examinations as integral parts of the courses in 
which they are given, if the marks for the whole term 
or semester depend upon examinations to only a rea- 
sonable degree, if tests are of such a nature that con- 
sistent and regular study and class work is evidently 
the best preparation for passing them, examinations 
are not only not harmful objectives, but even distinctly 
desirable ones. Pupils should be in readiness to call to 
mind and make use of what they have learned and 
to do their best on occasions demanding unusual effort 
and maximum results. Moreover it is true that if examinations
are so constructed as to cover what has
been taught adequately, the objective of being able to
pass them is very similar to that of having mastered 
the content of the course, an aim which certainly can- 
not be classed as undesirable. It is only when too much 
emphasis is placed on examinations, when they are 
made so that they lend themselves too readily to cram- 
ming, when a final examination is considered to be the 
last occasion upon which the subject-matter of the 
course covered will be needed, that the objective of 
passing examinations becomes markedly and undesirably
different from that of mastering content.

IV. Pupils do bluff and cheat upon examinations, 
but they likewise do so in their daily work, in their 
play, in their business dealings, and in various other 
activities, and yet it is not argued that pupils should 
be prevented from working, playing or having busi- 
ness dealings with others. Bluffing may be largely, if 
not entirely, done away with by the construction of ex- 
aminations upon which it is difficult to bluff and by 
marking which allows no credit for attempts to do so. 
To accomplish this requires critical and careful judg- 
ment on the part of the teacher. Whether or not pupils 
cheat depends upon three, or perhaps more, elements
in the situation. These three are the moral training
they have received and the ideals they possess, the 
apparent desirability of the end to be gained by cheat- 
ing, and the probable chance of being able to do so 
without being detected. If the general atmosphere with 
which pupils are surrounded, especially that of the 
school, is unfavorable to cheating, so that they feel 
the loss of self-respect from doing so is greater than 
any possible increase in their marks, if examinations 
are not overemphasized and thus too great a premium 
placed upon cheating, and if it is not made easy to
cheat, one need not fear their doing so any more during
examinations than during any other activity in
which they have a chance to gain something by so do- 
ing. 

V. It is undeniably true that such emphasis is often 
put upon speed and amount written, whereas good 
form is neglected, that poor handwriting, punctuation, 
sentence construction, and so forth are not only over- 
looked and allowed to pass uncorrected, but perhaps 
even developed. Instead of this being necessary just 
the opposite result should follow and does when the 
matter is properly handled. Indeed, as will be brought 
out later, one of the functions fulfilled by written ex- 
aminations is the development of good habits of lan- 
guage usage and ability in expression. The chief 
prerequisites to accomplishing this result are that 
sufficient time be allowed, high standards along those 
lines be maintained, and emphasis placed on the de- 
sired ends. 

VI. The argument that the time devoted to examina- 
tions can be more profitably employed is undoubtedly 
valid in some cases, just as the same argument would 
be in the case of any other activity. That is to say, there 
are teachers who devote too much time to testing the 
work of their pupils and who would much better spend 
a portion thereof in other more suitable ways. On the 
other hand, there are also teachers who allot too little 
time to this purpose. The proportion of the total time 
of any class which should be used for examination
purposes varies with the subject or phase of the sub- 
ject taught, the previous preparation and maturity of 
the pupils, the instructional methods employed, and
other factors, so that no exact or even approximate
rule should be laid down. In a relatively small class
the teacher can more easily keep in touch with the
work of each member through other means than formal 
tests than is possible in a large class, so that the 
amount of time which should be devoted to examining 
is less in the former case than in the latter. Also if 
a teacher is skilful enough to motivate the work of her 
pupils to a sufficient degree by other means than check- 
ing up on their work and appealing to their desire for 
high marks, less testing will be needed than if it 
must be employed for this purpose. Theoretical con- 
siderations, the opinions of practically all those best 
qualified to speak, and experimental evidence are 
united in support of the conclusion that in almost all 
cases pupils who are tested from time to time reach 
higher standards of achievement than do those who 
are not so tested. In other words, the use of a moderate
and reasonable amount of time for examination
purposes is justified by the increase in pupil achieve- 
ment which results. 

VII. In replying to the next objection, that the 
products of education are intangible and cannot be 
measured, one must again admit that there is some 
truth in the assertion and that they are relatively 
intangible as compared with those of industry and sci- 
ence, for example. The difference, however, is one of 
degree and not of kind. In many lines of work, as well 
as in education, it is impossible to measure results 
with a high degree of accuracy and completeness. In 
the past this same condition held in what are now con- 
sidered the exact sciences, and even now the accuracy 
of measurement therein is not absolute, but only rela- 
tively high. This fact should therefore not lead to 
abandoning the attempt to measure results in the field 
of education, but rather should stimulate renewed attack
upon the problem and the development of improved
measuring instruments. Although it is probable,
indeed practically certain, that measurements of
mental accomplishment and growth will never attain 
the same degree of objectivity,⁵ reliability and validity⁶
as those of height, weight, area, temperature, and
so forth, yet on the other hand it is very likely that 
they will be developed to a much higher degree of per- 
fection than they possess at present. Moreover, even 
relatively inaccurate measurements are for most pur- 
poses preferable to none at all, provided they are used 
with due recognition of the errors present and thus 
too great confidence in them avoided. 

⁵ Objectivity may be defined as that characteristic or quality of a measuring
instrument which causes it to yield the same results regardless of the personal
equation or subjective influence of the person giving and scoring it. In other words,
a test is objective if there is agreement among all competent scorers as to the
correctness or incorrectness of all possible answers. For further discussion, see
p. 40.

⁶ A measuring instrument is said to possess validity when it performs its stated
function, that is, when it measures what it claims to measure. Also see p. id.

VIII. In answer to the objection that examinations
are unnecessary, particularly for purposes of measurement,
the chief point to be made is that practically all
experimental evidence bearing upon this contention
shows that it is false. It is probably true that a few of
the very best teachers can estimate the achievements
of their pupils so well as not to require formal examinations
for this purpose. It is also likely that there
are a few teachers who can and do motivate and instruct
so efficiently that they do not need examinations
for purposes of stimulating pupils or of discovering
the weak points in their own instruction. Much evidence
has been accumulated, however, to show that if
such individuals do exist they are very few in number
and that practically all teachers need to make use of
formal examinations as one of the bases of judging the
work of their pupils, as well as of increasing their own
instructional efficiency.

In concluding the defense of examinations, it seems 
appropriate to quote briefly from an able, thoughtful 
and experienced educator, President Lowell of Har- 
vard (46). He writes, with reference to examinations, 
“... upon no part of the educational process can
time and thought be better spent.” More or less of the 
same idea in addition to others is expressed again, as 
follows: “the conclusion ... is that examinations 
properly used are a vital part of the educational process,
but that the art of using them to produce the best
results is highly complex and difficult.” 

4. Advantages of standardized tests over examina- 
tions prepared by the teacher. The statement has al- 
ready been made that standardized tests and examina- 
tions should be recognized as two complementary parts 
of a complete testing program and not in any sense as 
opposed to each other. Each possesses certain advan- 
tages over the other and has disadvantages compared 
with it, and each has its place which the other cannot 
fill adequately. As will be shown in more detail later 
it is true that ordinary examinations can be improved 
by applying certain of the principles followed in the
construction and use of standardized tests, but this 
does not mean that they can be so improved as to 
eliminate the need for the latter nor that they should 
be abolished and the latter substituted in their place. 

Perhaps the most important advantage of standardized
tests over examinations is that norms⁷ are
available for the former, and thus one can know the
average score of pupils in a given grade, of a given
age, or of some other specified homogeneous group. The
existence of norms permits a relatively accurate comparison
of pupils’ achievements with those of a large
group of pupils of the same grade, age, or other
common characteristic. Such a comparison is frequently
of marked advantage in classifying and promoting
pupils, in rating teachers and schools, in determining
time allotments, and in other educational
activities. It is thus rendered much easier for a school
system to be reasonably sure that it is in accord with
the general practice, so that completion of the sixth
grade, for example, represents approximately the
same stage of educational advancement as in other
systems, or that rating a teacher’s efficiency as superior
means about the same as does a similar rating elsewhere.
It should be pointed out, however, that the
advantages due to having established norms are not
as great as might appear upon first thought. The
published and generally available norms for standardized
tests are usually very general, being averages
based upon scores from many parts of the country
and many types of schools and school pupils, and therefore
in any particular situation it may be less significant
to compare the achievements of certain pupils
with such general norms than with those established
by testing a more narrowly homogeneous group. Indeed,
comparisons with an undefined group may be so
misleading as to be worse than none. For example, the
pupils coming from the homes of recent immigrants
where little English is spoken cannot reasonably be
expected to come up to the average performance of
all pupils in most subjects, whereas those whose parents
are native-born and of the higher professional
classes should exceed this. Likewise general norms
make no allowance for differences in the content of
what is taught or of the time given to teaching it in
different schools.

⁷ A norm is a statement of the average achievement of a group of pupils who
are in the same grade, of the same age, of the same mental ability, or who for
some other reason may be considered homogeneous with regard to the achievement
expected of them. It should be noted that it differs from a standard in that it is an
indication of what pupils actually achieve, whereas the latter is a goal of attainment
which they should reach.

Another advantage of standardized tests is that they 
are usually constructed by persons who are more expert,
both from the standpoint of knowing the subject-matter
and from that of understanding the principles
of test construction, than the average or even superior 
teacher. Also more care and time are generally de- 
voted to determining the content and formulating the 
exercises actually used in standardized tests than is, or 
can well be, given to the construction of an ordinary 
examination. Again in this case, however, the advan- 
tage is not unalloyed. A standardized test is generally 
based upon a consensus of opinion or practice or some 
other investigation of general conditions and therefore 
frequently is not well adapted to local variations, 
practices and needs, or to the particular emphasis 
given by the individual teacher. On the whole, however,
it is true that standardized tests are in closer
agreement with commonly accepted and desirable objectives
and cover the fundamentals of most subjects
more thoroughly and with a better distribution of em- 
phasis than do most examinations prepared by 
teachers. Therefore they should be employed and sup- 
plemented rather than not used. 

One of the most commonly claimed advantages of 
standardized tests over examinations is that they are 
more objective, reliable and valid. On the whole this is 
undoubtedly true, but two provisos should be noted. In
the first place, this advantage holds chiefly when standardized
tests are compared with ordinary discussion
examinations⁸ and not nearly as much, indeed often
not at all, when they are compared with the newer type
of so-called objective tests,⁹ which have many of the
attributes of those that have been standardized. In other
words, teachers may apply certain of the principles 
followed in making and scoring standardized tests to 
those which they construct for their own use and thus 
obtain some of the advantages possessed by the for- 
mer, incidentally avoiding some of the disadvantages 
also. 

⁸ The term discussion, traditional, or essay is applied to a written examination
of the type in common use for many years, the kind in which pupils are asked
to discuss, explain, describe, summarize or do something else which requires the
writing of sentences, paragraphs, or longer units.

⁹ The expression objective tests is frequently applied to those varieties of tests
which are also known as new-type tests and collectively as the new examination.
Their distinguishing features are that the pupil responses called for are very short,
being check marks, underlinings, crosses, figures, single words or other responses
which require a minimum of writing, and that they generally possess rather high
objectivity. For further discussion, see p. 175.

In the second place, it has been shown that if traditional
examinations are made, given and scored according
to our best knowledge of how to do so they are not
as inferior to standardized tests in objectivity and
reliability as has been commonly thought. Because of
the more or less common opinion to the contrary, what
is probably the most convincing investigation dealing
with this question will be described in some detail.
This study, which was made by Monroe and Souders
(61, pp. 27-42), included results from sixty-six groups
of children to whom traditional examinations had been
given. The preparation of the examinations and the
scoring of the answers thereto are described in the instructions
sent to the teachers who cooperated in the
experiment, and since a knowledge of these directions
is essential to a complete understanding of the results
obtained and conclusions reached they are quoted below.
It will be noted that they provide two methods
by which evidence as to the reliability of scoring was
secured. The following is their complete text:

“METHOD I 

“Two sets of examination questions are to be prepared by 
a single person, or two or more persons working together. 
Each of the two lists should contain the same number of
questions. There should be a distinct effort to make the two 
lists approximately equal in difficulty and as nearly as pos- 
sible similar in respect to the type of questions. 

“After the two lists of questions have been made both 
should be given by each teacher to all of her pupils under 
as nearly the same conditions as possible. If not given on the 
same day, the two examinations should be given within a 
period of one week. For example, if two sets of examination 
questions in seventh-grade geography have been prepared, 
both sets of questions should be given by each seventh-grade 
teacher to all of her pupils. 

“Each teacher is to mark both sets of examination papers 
for her pupils. In marking these papers the teacher should 
indicate the credit given for each question and write the total
grade plainly upon the examination paper. When two or more 
teachers have given the same examinations it is not necessary 
that they confer in regard to the marking of the papers. If 
this is done a memorandum regarding the procedure should 
be attached to the examination papers. 

“This method may be followed by a single teacher who has 
two or more sections of a given subject. Both examinations 
should be given to all sections taught by this teacher. This 
method of studying the reliability of written examinations 
can be applied to any school subject. The Bureau of Educational
Research is most interested in having it applied to
arithmetic, history, geography, and language in the elementary
school and to history, mathematics, English, and
science in the high school.

“METHOD II

“Two sets of examination questions for the same subject 
are to be prepared by two teachers working independently, 
each teacher preparing a set. There is no requirement con- 
cerning the length or the difficulty of the two sets of examina- 
tion questions except that both should cover the same amount 
of work. The teachers who prepare the questions should not 
confer concerning either the kind or the number of questions 
asked. 

“After the questions have been prepared, both sets are to 
be given by each teacher to all of her pupils. If not given 
on the same day, the two examinations should be given 
within a period of one week. 

“After the examinations have been given each teacher 
will grade all of the papers written upon the questions that 
she prepared. This will mean that she will grade a set of 
papers for her own pupils, and also a set for the pupils of the 
other teacher. There should be no conferring between the 
teachers in regard to the method of scoring. In marking these 
papers the teacher should indicate the credit given for each 
question and write the total grade plainly upon the examina- 
tion papers.” 

No other instructions than those just quoted were 
given. The subjects in which the examinations were 
used numbered fourteen and, with the exception of 
reading, included those most commonly taught in ele- 
mentary and high school, also one or two others. The 
coefficients of correlation¹⁰ or reliability¹¹ between the
two ratings of the pupils’ answers varied from -.20
up to + -SSj that is, from a slight tendency for the
higher marks given by one teacher to correspond to the
lower ones assigned by another to a fairly close
approximation to perfect agreement. The median¹²
coefficient was .65. Monroe and Souders compare this
figure with the median of a number of coefficients of
reliability for standardized tests, which was .75, the
range being from .19 to .92. Therefore they conclude
that in so far as reliability is concerned traditional examinations
may be almost as satisfactory as standardized
tests.

¹⁰ A coefficient of correlation is a numerical expression which summarizes the
degree or closeness of relationship between two series of corresponding scores or
measures of the same cases. Its value ranges from +1.00, which shows perfect
positive or direct relationship, through .00, which indicates no relationship at all,
to -1.00, which signifies that the relationship is perfect but negative or inverse.

¹¹ A coefficient of correlation between two series of scores of the same individuals
on the same test is called a coefficient of reliability, since it measures the
reliability of the test.
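
The manner in which such coefficients are obtained may be illustrated by a
minimal sketch in Python. It is offered only as an illustration and not as the
procedure actually reported by Monroe and Souders. The marks and the list of
group coefficients are hypothetical, and the function name `correlation` is an
assumption made for the example. The function computes the ordinary
product-moment coefficient between two paired series of marks, and the median
of several such coefficients is then taken.

```python
from math import sqrt
from statistics import median


def correlation(first, second):
    """Product-moment coefficient between two paired lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two paired lists of at least two scores")
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)
    # Deviation of each score from the mean of its own series.
    dev_first = [x - mean_first for x in first]
    dev_second = [y - mean_second for y in second]
    cross = sum(a * b for a, b in zip(dev_first, dev_second))
    spread = sqrt(sum(a * a for a in dev_first) * sum(b * b for b in dev_second))
    if spread == 0:
        raise ValueError("undefined when one series has no spread")
    return cross / spread


# Hypothetical marks of six pupils on two comparable examinations.
exam_one = [62, 70, 75, 81, 88, 93]
exam_two = [65, 68, 80, 78, 85, 95]
reliability = correlation(exam_one, exam_two)

# Hypothetical coefficients from five groups, summarized by their median.
group_coefficients = [0.40, 0.55, 0.65, 0.72, 0.81]
middle = median(group_coefficients)
```

A coefficient near +1.00 for `reliability` would indicate that pupils who rank
high on one examination tend to rank high on the other, while `middle` is the
value above and below which half of the hypothetical group coefficients fall.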

In actual practice it is probable, however, that the 
difference is greater than is indicated by the difference
of .10 between the reliability coefficients of .65 and .75. 
There is a strong tendency for standardized tests with 
low reliability either to receive little use or to be modi- 
fied so as to improve this feature. Furthermore, it is 
likely that the examinations employed in this investi- 
gation were constructed and scored with considerably 
more care than is usually the case when teachers pre- 
pare and mark them. In spite of these facts, however, 
the evidence appears to justify the conclusion that the 
difference in reliability between the two types of
measuring instruments is not as great as has been 
generally supposed. Moreover, it can be made so small 
as to be a matter of little consequence if examinations 
are made and scored according to the best known 
methods of so doing, methods which will be suggested 
later in this volume. 

¹² The median is the point on each side of which half of the cases or measures
in a distribution fall. Thus, in the case referred to in the text, half of the coefficients
were above .65 and half below.

From another standpoint, however, there is a more
important difference. The coefficient of reliability takes
account only of the variable errors¹³ present in the
scores and is in no way affected by the constant errors¹⁴
therein. In so far as the latter are concerned,
standardized tests have a very decided advantage over 
ordinary examinations. It is true that the scores on the 
former may contain constant errors because too much 
or too little time has been allowed, too much help given, 
or for some other reason which tends to affect all 
pupils alike, but errors due to such causes are on the 
whole relatively small compared with those in examina- 
tion marks which result from some teachers being hard 
markers and others easy markers, from differences in 
the general state of their health and emotions, and 
from other causes which produce more or less similar 
effects upon the marks given all the papers of a class 
or other group of pupils. 
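
The statement that the coefficient of reliability is unaffected by constant
errors may be made concrete by a short sketch in Python. All of the marks below
are hypothetical, and the helper `correlation` is an assumption introduced for
the example. A "hard marker" who takes ten points from every paper introduces a
constant error, whereas errors that differ from pupil to pupil are variable
errors.

```python
from math import sqrt


def correlation(first, second):
    """Product-moment coefficient between two paired lists of scores."""
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)
    dev_first = [x - mean_first for x in first]
    dev_second = [y - mean_second for y in second]
    cross = sum(a * b for a, b in zip(dev_first, dev_second))
    return cross / sqrt(
        sum(a * a for a in dev_first) * sum(b * b for b in dev_second)
    )


# Hypothetical marks that a fair marker would give six pupils.
fair_marks = [60, 68, 74, 80, 86, 92]

# Constant error: a hard marker lowers every paper by the same ten points.
hard_marks = [m - 10 for m in fair_marks]

# Variable errors: different for each pupil, some too high and some too low.
variable_errors = [6, -9, 8, -7, 5, -4]
erratic_marks = [m + e for m, e in zip(fair_marks, variable_errors)]

# The constant error leaves the coefficient at essentially 1.0 ...
r_constant = correlation(fair_marks, hard_marks)
# ... while the variable errors pull it below 1.0.
r_variable = correlation(fair_marks, erratic_marks)

# Yet the hard marker's average is a full ten points lower.
average_shift = (sum(hard_marks) - sum(fair_marks)) / len(fair_marks)
```

In this sketch the coefficient alone would suggest that the hard marker's
marks are entirely satisfactory, although every pupil has been marked ten
points too low; only a comparison of the averages reveals the constant error.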

¹³ Variable or accidental errors are those which differ for the various members
of the group concerned. They are due to chance causes and usually result in
some scores which are too large and others which are too small.

¹⁴ A constant error is one which is common to the scores of all the members
of the group and produces errors which are all in the same direction and tend
to be equal, either absolutely or relatively.

One advantage often claimed for standardized tests
is that through their purchase and use the time devoted
to preparing examinations, which is often considerable,
is saved for more profitable employment and that
much less time is required to score them than is demanded
by ordinary examinations. That time is saved
is, of course, true, but it may be questioned if devoting
a reasonable amount of time to the preparation of
examination questions is not fully justified by the results.
The careful, thoughtful preparation of a set of
questions should lead to an evaluation of what has
been attempted that is often of great benefit to a
teacher. Perhaps a similar helpful result would ensue
from the careful study and selection of standardized 
tests, but it seems very unlikely that it would be nearly 
as great. As to saving time in scoring, this is an 
advantage which is very marked when standard tests 
are compared with discussion examinations, but it 
becomes slight when they are compared with new-type 
tests. Even in the former case it is frequently found 
that teachers who are not familiar with the methods of 
scoring used in connection with standard tests waste 
so much time getting started that in the long run they 
save very little by their use. However, this is a condi- 
tion which less often exists as such tests come into 
wider use and as more teachers learn about them in 
their training and employ them in their instruction. 

5. Advantages of examinations prepared by the 
teacher over standardized tests. In discussing the ad- 
vantages of standardized tests over examinations some 
references have already been made to advantages of 
the latter over the former. Perhaps the chief one of 
these is that examinations can be adapted to local 
courses of study, to points of emphasis of individual 
teachers, to other variations from general practice 
which should be covered by the tests given, and to the 
needs of particular classes or even of individual pupils. 
Absolute uniformity is deadening and there is liable 
to be a tendency toward it when only standardized 
tests are employed. However, it is only fair to say that 
many, perhaps most, teachers are more likely to have 
their work improved by some such unifying influence 
than are likely to have it made less efficient. On the 
whole it is best to employ some tests which tend toward 
uniformity and others which allow for desirable local 
variation and adaptation.

A second disadvantage of standardized tests is that
satisfactory ones do not yet exist in all subjects and, 
still more, in all divisions or phases of all subjects. 
Although the tendency to prepare standardized tests 
which cover small units of subject-matter rather intensively
appears to be growing, there are so far few
complete series of such tests. Most standard tests are 
much more suitable for use at the completion of a 
course or a large unit thereof than over small topics or 
portions of subject-matter. Therefore examinations 
prepared by the teacher are needed to supplement 
standard tests even if the latter are rather largely 
used. 

Many of the standardized tests possess only one 
form and very few more than two or three. In some 
cases it is possible to test pupils with the same ex- 
ercises over and over again at rather short intervals 
with satisfactory results, but in most instances the 
practice effects are too great. Therefore standardized 
tests frequently cannot be used as often as tests of 
some sort should be given and so need to be supple- 
mented for another reason than that stated in the last 
paragraph. The remark should probably be made that
in many instances it is comparatively easy for well- 
trained and able teachers to construct tests similar in 
form and type of content to standardized ones, and 
that such tests may frequently well be used in place of 
duplicate forms of standard tests. For many purposes 
the fact that they may not happen to be of approximately
the same difficulty is unimportant.

Partly because of the already mentioned fact that
standard tests are rarely closely adapted to local conditions
and partly for other reasons the results secured
thereon are frequently difficult to interpret in the most
helpful ways. It has been shown above that comparison
of the achievements of pupils with a general norm is 
often of little benefit and may even be misleading. 
Most teachers and practically all school patrons are 
much more used to thinking in terms of percentile or 
literal marking systems than of standardized test 
scores, even though the latter are transmuted into 
grade, age, or other equivalents¹⁵ and therefore the
former conveys a much more easily understood idea
of pupil performance to them. It is true that standardized
test scores can be changed into such marks or,
better yet, that both teachers and patrons can be 
educated to understand the significance of scores or 
their equivalent expressions, but the former requires 
a considerable amount of labor and the latter cannot be 
brought to pass at once. 

Another disadvantage of standardized tests is that 
usually one must determine just what tests are to be
employed longer in advance than is always convenient,
since otherwise the necessary supplies cannot be 
ordered and secured in time. Although this necessity 
may produce certain good effects by requiring more
careful consideration and planning of testing pro- 
grams and materials, yet there are situations and 
occasions when it would be very undesirable to have 
to make all such decisions far enough ahead to per- 
mit the obtaining of standardized tests. Unexpected 
exigencies sometimes arise which make it best to give 
a test or modify the content of one already planned 
upon rather short notice. 

¹⁵ In connection with many standardized tests provision has been made for
turning the scores actually made thereon into terms of what pupils of a given
grade, age, or degree of mental ability usually accomplish or into other statistical
equivalents such that all may be put upon the same basis.

A final disadvantage of standardized tests is their
cost. In some instances this is only one cent, or even
less, per pupil, and it rarely exceeds five or six cents 
for a single test, but when this amount is multiplied 
by the number of pupils and the resulting product by
the number of tests to be used during a year, the 
amount becomes large enough that one may well hesi- 
tate over its expenditure. Certainly if absolutely all 
testing should be done by means of standardized tests 
any school system would find it very difficult to carry 
on an adequate testing program and to justify the ex- 
penditure required in view of the total amount of 
money available and the other needs of the system. 
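
The arithmetic just described may be sketched in Python as follows. The
per-pupil costs of one and five cents come from the text; the enrollment of
2,000 pupils, the figure of twenty tests a year, and the function name
`yearly_test_cost` are hypothetical and are assumed only for illustration.

```python
from decimal import Decimal


def yearly_test_cost(cost_per_pupil, pupils, tests_per_year):
    """Cost per pupil for one test, times pupils, times tests in a year."""
    # Decimal avoids the small rounding errors of binary floats with money.
    return Decimal(cost_per_pupil) * pupils * tests_per_year


# Hypothetical system: 2,000 pupils, each given 20 tests a year.
cheapest = yearly_test_cost("0.01", 2000, 20)  # one cent per pupil per test
total = yearly_test_cost("0.05", 2000, 20)     # five cents per pupil per test
```

Under these assumed figures the yearly outlay ranges from four hundred to two
thousand dollars, which shows how a cost that is trifling for a single test
becomes considerable when multiplied through a whole testing program.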

6. Summary. Examinations are not new but have
been in use in many countries for many purposes dur- 
ing a long period of time. It appears to be only re- 
cently, however, that any considerable amount of criti- 
cism has arisen concerning their form and function. 
Probably the chief factor in causing this criticism has
been the discovery that marks upon traditional ex- 
amination papers appear to be decidedly unreliable. 
Despite the opinions of some persons that examina- 
tions of the ordinary sort should be abolished or per- 
haps superseded by standardized tests, the best way 
to remedy the situation is to do neither, but to im- 
prove examinations. There is some validity in all of 
the eight adverse criticisms most often made against 
examinations, but most of these points no longer apply 
if tests are properly constructed and administered. 
Even if some limitations and disadvantages are un- 
avoidable or practically so, this fact does not warrant 
discarding examinations. Standardized tests are in a 
number of ways preferable to teacher-made examinations
and possess distinct advantages over them, but,
on the other hand, examinations also have a number of
points in their favor. Therefore a well-rounded test- 
ing program includes the use of both of these forms of 
measuring instruments. 



CHAPTER II 

WHAT ARE GOOD EXAMINATIONS? 


1. The purposes examinations should serve. In order 
to determine the merit of an examination one must first 
consider the purposes it is designed to serve and then 
rate it in terms of how well it does so. Numerous state- 
ments have been made as to the purposes which ex- 
aminations should accomplish. Although many of these 
statements have been entirely inadequate or otherwise
unsatisfactory, a number have been excellent and
it seems worth while to mention two of the briefer ones, 
also one not so short. Brinkley (7, p. 2) gives a defini- 
tion of a good test in which he states that it is one 
which, with a minimum expenditure of the time and
energy of both pupil and teacher, (1) gives an exact 
and accurate measure of the trait which it is desired 
to measure; (2) makes possible a diagnosis of individ- 
ual weakness and strength of pupils, of the adaptabil- 
ity of the subject-matter, and of the suitableness of the 
methods used; (3) stimulates correct study and habits 
of study; and (4) stimulates and directs proper 
methods of teaching. The other of the two short state- 
ments referred to is by Dadourian (16), who gives the 
following four principal objects to be attained: 

“1. To determine the degree of the grasp of the subject 
by the student.

2. To give the student a chance to better his knowledge of 
the subject through review. 



3. To give the student a taste of concentrated effort at 
logical thinking and at clear and connected expression. 

4. To give the student an opportunity to look at the sub- 
ject as a whole, to enable him to see the ‘forest’ under whose 
trees he has been wandering during the term.” 

It will be noted that the first of these statements 
differs from the second in that the last purpose given 
by Brinkley, that of helping the teacher, is not men- 
tioned by Dadourian. All of his points as well as Brink- 
ley’s first three deal with the matter from the pupils’ 
standpoint. Otherwise, the content of these two sets 
of four each is practically the same, although the divi- 
sion into points and the wording differ. 

A longer and somewhat more detailed list of aims 
than either of those quoted above is frequently given. 
One of the best of such lists is that of Symonds (85,
pp. 1-2).¹ Although he states these as being purposes
of measurements, under which he includes both ordi- 
nary examinations and standardized tests, yet they 
may well be given as applying to the former alone.
They are as follows:

“1. To inform pupils of their achievement. 

2. As incentives to study. 

3. To promote competition 

a. Between groups;

b. Between individuals;

c. With one’s past record. 

4. To determine promotion.

5. To diagnose weak spots in the pupil’s achievement. 

6. To determine the quality of instruction. 

7. To determine admission to high school. 

8. To place a pupil in the school. 

1 From Symonds' Measurement in Secondary Education, pp. 1-2. By permission
of The Macmillan Company, publishers.




9. To determine admission to college. 

10. To provide reports to parents. 

11. To determine credits, honors, etc. 

12. Educational and vocational guidance. 

13. To rate teachers. 

14. To predict a pupil’s success. 

15. To study the efficiency of the school.” 

It will be seen that all or practically all of the fifteen 
purposes which Symonds names may be grouped under
the four more general ones in either of the lists pre- 
viously given. The same is true of many of the lists of 
aims of examinations given by others. That is, they 
tend to be similar although some are relatively short
and others long. The differences that do exist among 
various writers have not generally resulted from 
marked disagreements in their beliefs as to the proper 
functions, but rather from their different points of 
emphasis and from the fact that they are thinking of 
different kinds of examinations. It should be clearly 
recognized that although some of the desirable func- 
tions of examinations may be common to all forms 
and to all conditions of use, others cannot be expected
to ensue except under certain conditions. Thus all the 
aims accomplished by announced and unannounced 
tests are not the same, neither are all those of traditional
examinations and objective tests or even of particular
varieties of one or the other.

As a result of the inspection of many statements 
of the aims of examinations prepared by others and 
also of his own thinking, the writer wishes to suggest 
a list of six chief functions which examinations should 
fulfill. There is some overlapping between the aims 
named below and some of them might perhaps be
further divided, but the list as given appears most
satisfactory. The six are as follows: 

I. The measurement of pupil ability and accomplishment.

II. The diagnosis of pupils, especially of those doing un- 
satisfactory work. 

III. The measurement and improvement of teaching ef- 
ficiency. 

IV. The provision of opportunities for learning. 

V. The motivation of pupil study and other mental ac- 
tivity. 

VI. The determination of standards or goals of attainment. 

I. The first of the purposes just named, that of
measuring pupil ability and accomplishment, is not 
only justifiable but important, yet as compared with 
the other aims it has received, and still receives, entirely too
great an amount of emphasis. It is usually thought of 
as the primary function of examinations, and as they 
are employed by most teachers it is much more promi- 
nent than any other. The measurement of pupils’ 
stocks of memorized facts, separate and related, the 
determination of how much they know and how well 
they know it, the testing of their powers of application, 
organization and reorganization, are not only useful 
but indeed essential to educational efficiency. The re- 
sults of such measurements are very helpful in assign- 
ing marks to pupils, in determining their fitness for 
promotion, in classifying them within a grade, and for 
various other purposes. Teachers’ estimates might be 
used for these purposes, but they are generally un- 
reliable and not remembered well enough upon the 
occasions when they are to be used. Teachers tend to 
remember estimates based on the first, last, or most 
striking impressions made by pupils rather than on the
sum total of their achievement, and therefore base
their ratings on only a small portion of pupils’ total 
work. Just as those interested in the physical development
of children make frequent and careful measurements
of weight, height, and other characteristics, so
those interested in their mental growth should do the
same in that field.

II. The second purpose, that of diagnosis, might be 
classed under the first, but nevertheless seems im- 
portant enough to be considered separately. The chief 
purpose in measurement for diagnosis is not to de- 
termine what marks pupils should receive or whether 
or not they should be promoted, but rather to find out
just what instruction is most needed. It centers attention
upon what pupils have not learned and why they
have not done so. One may think of a diagnosis as an 
inventory of present status plus an investigation of 
past causes of this status with a view to determining 
what should be done to improve it. Diagnosis should 
reveal differences not merely in the subject-matter 
learned by various pupils, but also in their abilities 
and capacities, their attitudes, interests, and other 
characteristics which affect success in school. It should 
have as its aim definitely remedial work and unless it 
is thus followed up is scarcely if at all worth the time 
and trouble required. This remedial work should not 
be limited to the narrow field of particular school 
subjects, but have a much broader scope, including 
educational and vocational guidance. 

III. Measuring and improving teachers’ efficiency 
may likewise be thought of as a subdivision of the first 
aim, that of measuring the achievements of pupils, and 
is also closely connected with diagnosis, but is given 
separate mention because it should receive especial
emphasis. The abilities and achievements of pupils
need to be measured so that teachers may know how 
effective their own work has been. Such measurements 
help teachers to judge whether or not the best possible 
selection of subject-matter has been made, to deter- 
mine if the best methods have been employed in pre- 
senting it, and so on. Unless teachers know rather 
definitely what they are accomplishing it is very dif- 
ficult for them to make any improvement and still 
harder to determine how much, if any, has been made. 
Measurement for this purpose serves to stimulate both 
inferior and superior teachers. The former can be 
made to realize the need for improvement and the 
latter encouraged to further efforts by the knowledge 
of their success in the past. Both will do better work 
through being sure that their efforts are resulting in 
improvement. 

IV. The purpose of learning on the part of the 
pupils is an important function of examinations which 
is very often almost or even entirely neglected. Writ- 
ten examinations as usually given may provide learn- 
ing exercises for some pupils, but rarely do so for a 
large number and much of the material in them con- 
tributes nothing to this end. Perhaps all of it cannot do 
so and at the same time accomplish other desirable
functions. What is chiefly needed, however, appears to 
be the recognition of this function and the adoption of 
the proper attitude on the part of the teachers rather 
than any great modification of the types of material 
included. Tests can rarely if ever be expected to pro- 
vide satisfactory opportunities for original learning, 
that is, the learning of something not before encoun- 
tered. They can and should, however, provide many 
chances of relearning, of review and recall, of practice
and drill, of emphasizing and fixating facts, of organizing
and reorganizing what has already been learned,
of getting a view of a large unit of subject-matter as 
a whole, and of other mental activities which constitute 
goals of instruction. To choose only one example, the 
power to recall quickly, completely and accurately what 
one knows is of high value and worth developing even 
at the cost of considerable time and effort. The very 
fact that examinations are usually periods of more 
intense concentration than the time devoted to study 
should make them more effective in fixing in mind 
what is recalled and used in answering the questions 
which compose them. Furthermore, discussion examinations
and perhaps some of the newer types may
furnish training in expression. This should not be 
thought of as a primary function, but rather as one to 
be secured whenever possible without interfering with 
the more important ones. Perhaps still more important 
than anything mentioned so far in this paragraph is 
the habit of working under pressure and relying upon 
one’s own resources while doing so. If examinations 
can increase to any considerable extent the power to 
work independently, to be able to know and use with 
confidence what has already been learned, to summon 
one’s highest powers in a crisis demanding their use,
they justify their existence regardless of whether or 
not other desirable results are consequent upon their 
use. 

V. The fifth purpose mentioned, that of motivating
or stimulating pupils’ work, is one which would be 
unnecessary in an ideal class-room situation. Most 
readers undoubtedly will agree that it is highly de- 
sirable for pupils to study and master their work 
chiefly for the satisfaction which comes from mental
accomplishment and, to a lesser degree, for the future
benefits to be derived from such mastery, but they will 
also probably agree that it is practically, if not quite, 
impossible to secure a satisfactory degree of effort 
on the part of pupils when only such ideal motives 
are employed. In practice, not only school pupils, but 
individuals of any age and in any walk of life, will al- 
most always do better work if they know that what 
they do will be inspected or rated from time to time 
and that definite advantages or satisfactions will result 
from their receiving satisfactory or superior ratings. 
It is possible that the time may come when most 
teachers will be able to secure satisfactory work from 
pupils without using tests to aid in so doing. At 
present, however, there are very few if any who can 
do so and this condition will in all probability continue 
to exist for many years in the future. Very few in- 
dividuals, in or out of school, will put forth anything 
approaching their best efforts unless they have some 
concrete and more or less material end in view. Appeals 
to far-away needs and values tend to be too abstract, 
indefinite and uncertain to be held in mind steadily by 
the majority of adults and still less by most elementary
and high-school pupils. Few experienced teachers
will deny that tests do hold pupils down to work and 
that the knowledge that a day of reckoning is to come 
stimulates greater activity and therefore more learn- 
ing. This is especially true with regard to study for 
permanent or semi-permanent retention rather than 
merely for keeping certain facts in mind until the next 
class period or some other time in the immediate 
future. Furthermore, practically all pupils derive definite
feelings of satisfaction from showing that they
know and are able to make use of what they have
studied. Because of this fact, tests and examinations
upon which they are able to make creditable scores are 
a source of stimulation provided the other conditions 
under which they are taken do not make them too unpleasant
and disagreeable. Thus examinations stimulate
indolent and inferior pupils by acting as prods
to better work and studious and superior ones by af- 
fording an outlet for expression. 

VI. The setting of standards or goals of attainment 
presupposes measurement. The results of satisfactory 
written tests show with some definiteness and accuracy 
what pupils of a given grade, age, or other homogene- 
ous group are doing and thus enable standards of what 
they should do to be set up. It is difficult to do efficient 
work unless one has in mind the definite goals toward 
which he is working and knows approximately how 
well these goals are being attained. Therefore this 
function has a significant place in promoting teacher 
efficiency, as well as in connection with the promotion, 
classification and marking of pupils. 

2. The qualities of good examinations. In the light
of the foregoing purposes which examinations should 
fulfill a number of qualities may be named as essential 
or desirable. In some instances the mere statements of 
the qualities imply how they can be attained, in others 
this is not true. Therefore both here and in subsequent 
chapters suggestions will be given which will aid in 
securing these qualities. The critical reader will notice 
that there is some overlapping between the different 
qualities and principles stated below, but this has been 
allowed in preference to running the risk of not em- 
phasizing certain points sufficiently. 

A very important quality which should be possessed
by a test in as great a measure as possible is objectivity.
According to the definition already given, a test
is objective when scores given by different persons 
are exactly the same. Another way of expressing this 
is to say that a test is objective when there is no doubt 
in the minds of competent persons as to what answers 
are right and what are wrong. All subjective elements, 
such as personal opinion and judgment, should be 
eliminated. A test is commonly said to be perfectly 
objective if this condition holds for every item or ex- 
ercise which it contains. It is true that in practice 
the scores given by different persons do not agree 
exactly because of occasional errors made in scoring, 
but if they do agree when these errors are corrected 
the test concerned is considered perfectly objective. 
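
The definition of objectivity just given lends itself to a simple check: have two persons score the same set of papers independently and see whether their scores agree exactly. The sketch below assumes the scores are kept as two lists in the same paper order; the function names are hypothetical.

```python
def is_objective(scores_a, scores_b):
    """Return True when two scorers gave identical scores to every paper."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both scorers must mark the same set of papers")
    return all(a == b for a, b in zip(scores_a, scores_b))


def disagreements(scores_a, scores_b):
    """Return the positions of the papers on which the two scorers differ."""
    return [i for i, (a, b) in enumerate(zip(scores_a, scores_b)) if a != b]
```

Papers listed by `disagreements` would be rechecked first for clerical slips in scoring; only differences that remain after such slips are corrected count against the objectivity of the test.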

Another important point in judging the merit of an 
examination or test is its reliability. This may be defined
as the accuracy with which a test or examination
measures whatever it does measure. Another and 
probably more common statement of the meaning of 
reliability is that it is the degree to which scores made 
upon a test at one time agree with scores made by the 
same pupils upon the same test² at another time. If
the agreement is absolute the test is perfectly reliable ; 
if not, the degree of reliability is the degree to which 
such agreement approaches completeness. There is 
no contradiction between these two definitions, since 
they merely emphasize different aspects of the question. 
Sometimes the expression self-correlation is used in- 
stead of reliability. Both are essentially synonymous 
with accuracy of measurement. 

2 The expression same test should be interpreted to include not merely an identical
test, but also a similar and duplicate test, that is, one which does not contain
just the same items or exercises, but which does include the same number
thereof, presented in the same form, covering the same subject-matter, and possessing
the same difficulty.




Unless the test in question includes and measures 
all of the subject-matter to be covered, it cannot be 
perfectly reliable. For a limited number of facts or 
other items, such, for example, as the sums or products 
of all the single digit combinations, the spelling of a 
particular list of 50 or 100 words, or the memorizing 
of a definite list of dates in history, it is generally 
practicable to make tests which include every item and 
which therefore may be perfectly reliable. However, 
even such tests are rarely if ever perfectly reliable in 
the full sense of the word because what they are de- 
signed to measure is not merely the pupils’ responses to 
the test exercises, but the knowledge or ability which 
lies back of those responses. Since this knowledge or 
ability, or at least the power to express it, fluctuates 
from time to time in individual pupils, there is almost 
never absolute agreement between the scores on even 
identical tests given at different times. In most cases, 
however, the measurement of pupils’ mastery of a 
certain body of subject-matter is attempted by means 
of exercises which only sample it. Since the total 
content is much too great to be covered in a single 
examination, the degree of reliability thereof depends 
largely upon how good a sample of the whole is in- 
cluded in the test. Other things being equal, the larger 
the sampling, the higher the reliability. Thus by in- 
creasing the length of a test, reliability is usually 
increased. This increase may be accomplished by add- 
ing more items to those already included, thus requir- 
ing more time, or by substituting a larger number of 
shorter items for fewer longer ones without lengthen- 
ing the time needed. 

There are a number of statistical measures used for 
measuring and stating the reliability of tests. The most
common of these will be very briefly discussed here
and several others mentioned, leaving the reader who
wishes more extended treatment to consult some other
source.³ Probably the most usual means of measuring
and describing the reliability of a test or examination
is by means of the coefficient of reliability or of
self-correlation, which is merely the coefficient of correlation
between the scores made by the same pupils
when taking the test at two different times. A coefficient
of reliability of +1.00 indicates absolute agreement
or perfect reliability. From this the coefficient
may range down through zero, which indicates no agreement
at all, to -1.00, which indicates the most extreme
disagreement possible or perfect negative correlation.
A coefficient of reliability of .90 or above is
usually considered decidedly high and any test with
such a coefficient is sufficiently reliable to warrant
its use. Indeed, many of the widely used standardized
tests as well as most ordinary tests and examinations
have reliability coefficients decidedly below
.90.
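
A minimal sketch of how a coefficient of reliability might be computed from two sittings of the same test follows. It assumes the scores are paired pupil by pupil and uses the ordinary product-moment coefficient of correlation; the function name and the sample scores are hypothetical.

```python
from math import sqrt


def reliability_coefficient(first, second):
    """Product-moment correlation between two sets of scores for the same pupils."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("need two equally long lists of at least two scores")
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sums of products and squares of the deviations from each mean.
    sum_xy = sum((x - mean_1) * (y - mean_2) for x, y in zip(first, second))
    sum_xx = sum((x - mean_1) ** 2 for x in first)
    sum_yy = sum((y - mean_2) ** 2 for y in second)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("scores must vary for the coefficient to be defined")
    return sum_xy / sqrt(sum_xx * sum_yy)


if __name__ == "__main__":
    # Hypothetical scores of five pupils on two sittings of one test.
    first_sitting = [12, 15, 9, 20, 17]
    second_sitting = [13, 14, 10, 19, 18]
    print(round(reliability_coefficient(first_sitting, second_sitting), 2))
```

The result is read against the range from +1.00 to -1.00 described in the paragraph above.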

Several other means of measuring and stating the 
reliability of test scores have been more or less widely 
used in connection with standardized tests. Among 
these are the coefficient of alienation,⁴ the probable

3 Most of the textbooks on educational statistics and some of those dealing with
tests and measurements treat this question at some length. The following are
especially recommended:

Odell, C. W., Educational Statistics. New York: The Century Co., 1926, pp.
185-180, 230-241.

Monroe, W. S., Introduction to the Theory of Educational Measurements.
Boston: Houghton Mifflin Company, 1928, pp. 201-219, 350-354.

Ruch, G. M., and Stoddard, G. D., Tests and Measurements in High School
Instruction. Yonkers: World Book Company, 1927, pp. 355-374.

4 The coefficient of alienation, abbreviated k, may be defined as the ratio of
the size of errors involved in test scores to the variability of the scores themselves.
It is computed by the formula $k = \sqrt{1 - r^2}$, in which r is the coefficient
of correlation between the two series of scores concerned.





errors of estimate⁵ and of measurement⁶ and the
ratios of these probable errors to the mean⁷ or standard
deviation.⁸ Although the meaning of these is in
many ways more concrete than that of the coefficient 
of reliability and although they possess certain advan- 
tages in other respects, the coefficient of reliability is 
recommended for ordinary use by the class-room 
teacher in judging and comparing the reliability of 
different tests. The other expressions all require some- 
what more computation and it is doubtful if they 
justify the additional labor involved when only a small 
number of students and an ordinary examination are 
in question. Anyone who wishes to do careful experi- 
mental work along this line should become acquainted 
with them by consulting the references given above 
or other similar ones. 
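
For the reader who does wish to try two of these measures, the following sketch computes the coefficient of alienation, k = sqrt(1 - r^2), and the probable error of estimate, .6745 times the standard deviation times k. The function names and the sample values of r and of the standard deviation are hypothetical.

```python
from math import sqrt


def coefficient_of_alienation(r: float) -> float:
    """k = sqrt(1 - r^2), where r is the coefficient of correlation."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    return sqrt(1.0 - r * r)


def probable_error_of_estimate(sigma: float, r: float) -> float:
    """P.E. of estimate = .6745 * sigma * sqrt(1 - r^2)."""
    return 0.6745 * sigma * coefficient_of_alienation(r)


if __name__ == "__main__":
    # Hypothetical test: standard deviation of 10 points, r = .90.
    print(round(coefficient_of_alienation(0.90), 3))
    print(round(probable_error_of_estimate(10.0, 0.90), 2))
```

Because both values shrink only slowly as r rises, they give a more concrete sense than r alone of how much error remains in scores from a test with a seemingly high coefficient.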

Since, as has been stated, increasing the length of 
a test increases its reliability, coefficients of reliabil- 
ity should be compared with this fact in mind. If the 
same form and type of content are preserved when a 
test is lengthened, the coefficient of reliability of the 
lengthened test will have the same ratio to that of the 
original test as the square root of the ratio of their 
lengths. Thus doubling the length of a test makes it 


5 The probable error of estimate is a measure such that half of the differences
between two series of scores are less than it and half are greater. There are
several formulas by which it may be found, of which the most frequently used is
probably $\text{P.E.}_{\text{est.}} = .6745\,\sigma\sqrt{1 - r^2}$.

6 The probable error of measurement is similar to the probable error of estimate
except that it applies to the differences between a series of actually obtained scores
and the theoretically true scores. The most common formula is
$\text{P.E.}_{\text{meas.}} = .6745 \ldots$

7 The mean is the same as the ordinary arithmetic average, that is, the sum
of all the scores or other data divided by their number.

8 The standard deviation is a measure of the spread, scatter, or variability of
a set of scores around their average. Its size is such that the differences between
the scores and their average exceed it in slightly less than one-third of the cases
and are smaller in about two-thirds.




approximately 1.4 times as reliable ($\sqrt{2} \approx 1.4$), making
it four times as long renders it twice as reliable
($\sqrt{4} = 2$), and so on. Tests containing material in
different forms cannot be judged as to length by the
number of items they contain. For example, the reliability
of an ordinary discussion examination and of
a true-false test, or even of two new-type tests such as
a true-false test and a multiple-answer test, cannot be
so compared. A better and probably the best rough 
basis for judging their length is by the time they con- 
sume. Thus to be of equal merit with respect to reli- 
ability any two examinations which require that the 
pupils work the same length of time should possess 
the same coefficient of reliability, and the ratio between 
the coefficients of reliability of those requiring unequal 
periods of work should be proportional to the square 
roots of the lengths of time required. 
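
The square-root relation stated above can be put into a short sketch for comparing tests of unequal length or working time. The first function follows the rule as the text gives it. The second applies the Spearman-Brown formula, which is not given in the text and is included here only as a widely used alternative estimate. The function names and the sample coefficients are hypothetical.

```python
from math import sqrt


def square_root_rule(r: float, length_ratio: float) -> float:
    """Rule stated in the text: reliability varies as the square root of length."""
    return r * sqrt(length_ratio)


def spearman_brown(r: float, length_ratio: float) -> float:
    """Spearman-Brown estimate (not from the text) for a lengthened test."""
    return length_ratio * r / (1.0 + (length_ratio - 1.0) * r)


if __name__ == "__main__":
    # Hypothetical 30-minute test with r = .60, lengthened to 60 minutes.
    print(round(square_root_rule(0.60, 2), 2))
    print(round(spearman_brown(0.60, 2), 2))
```

The square-root rule is best treated as a rough guide for low or moderate coefficients, since applied to a test that is already highly reliable it yields a value above 1.00, whereas the Spearman-Brown estimate always remains below that limit.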

One of the most important qualities to consider
in connection with a test or examination is that of
validity. As the term has already been defined, a test is
valid if it measures what it is intended to measure.
Validity depends upon objectivity and reliability, the
selection of exercises, and various other factors. Very
frequently a test which appears to be a good test is
found, upon careful examination, to measure some
other ability than the one it is designed to measure to
such a large degree as to impair its measurement of
the latter. For example, an arithmetic test which
calls for the addition or subtraction of single integers
or perhaps of even larger numbers has been
shown, at least in the case of many bright pupils, to
yield scores which are largely measures of speed of
writing figures rather than of addition or subtraction
ability. In other words, bright pupils can add or subtract
more rapidly than they can write their answers
and so the test does not reveal how rapidly they can 
perform those operations. All too frequently discus- 
sion examinations call for so much writing in such a 
limited time that pupils’ speed of writing and freedom 
from fatigue are measured rather than their ability 
in the subject dealt with. Another common cause of 
low validity is that exercises are not clearly stated. 
As a result, some pupils interpret them one way and 
some another, and thus all do not attempt to do the 
same thing. Sometimes the same condition holds with 
regard to general directions so that all pupils do not 
attack the whole examination in the same way. For ex- 
ample, if some pupils spend all their time answering 
the first two or three of ten questions very fully, 
whereas others write on all ten but much more briefly 
on each, the results are not valid for purposes of com- 
paring the pupils of one group with those of the other. 

The most usual way of expressing the validity of a
test is by means of the coefficient of correlation between
it and whatever is used as the criterion measure.⁹
Unless the coefficient of correlation or validity¹⁰
is rather high, that is, fairly close to +1.00, the test
is not considered to possess satisfactory validity.

A good examination must be easy for the teacher 
to give and score, as well as for the pupils to take. 
This means that giving it should not require a great 
deal more labor and care on the part of the teacher 
than supervising any other work done by her pupils. 

9 The term criterion measure is very often applied to the measure of the
ability or trait in question which is assumed to be a true measure and with
which the measures whose validity is being determined are compared.

10 A coefficient of validity is merely a coefficient of correlation between test
results and criterion measures. Such a coefficient is a measure of the validity of
the test concerned.



Also it should call for answers in such forms as can
be recorded easily by the pupils. 

A good examination is also economical of the time 
of both teacher and pupils. In other words, the amount 
of time consumed in making the questions and other- 
wise preparing the examination, in giving it to the 
class, in having the members thereof write upon it, 
and in scoring, should not be excessive. No absolute 
standards or limits can be set, but the purposes of the 
examination, the amount covered, and the maturity
of the pupils should determine the amount of time that 
can be profitably used. A maximum limit on pupils’ 
actual working time of perhaps ninety minutes for 
high school and from that down to fifteen or twenty 
minutes in the lowest elementary grades may be sug- 
gested. A teacher should generally be able to score a 
paper in not much over 5 per cent of the time required 
by the pupil to write it. 
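
The scoring allowance suggested above is easily turned into a rough time budget. The sketch assumes the 5 per cent figure from the text; the class size and the function name are hypothetical.

```python
def scoring_minutes(working_minutes: float, papers: int, fraction: float = 0.05) -> float:
    """Teacher's scoring time for a whole class at `fraction` of pupil working time."""
    return working_minutes * fraction * papers


if __name__ == "__main__":
    # Hypothetical class of 30 pupils writing a 90-minute examination.
    print(scoring_minutes(90, 30), "minutes")
```

An examination whose papers take much longer than this to score would, by the standard suggested in the text, not be economical of the teacher's time.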

An examination should be sufficiently interesting 
and otherwise stimulating to the pupils to call forth 
approximately their best efforts. They should feel 
that taking it is worth while as well as interesting. 

An examination should rarely contain “catch” ques- 
tions. Many people believe that they should never be 
used, but this viewpoint seems somewhat extreme.
An occasional question of this type may be inserted to 
test the clear thinking and care exercised by pupils in 
answering examination questions. Probably the best 
place in which to insert such occasional “catch” ques- 
tions is in rather long lists of exercises of the same 
general form. Thus in a list of multiple-answer ex- 
ercises, one may be inserted for which all the given 
answers are incorrect, a fictitious name may be placed 
in a list of cities about which a certain fact such as
location is asked, or such a question as “In what year
was Clay elected President of the United States?”
may be inserted in a list of questions asking for dates. 

If an examination is given primarily for purposes of 
obtaining accurate and relatively complete measures of 
pupils’ ability, it should be easy enough that all those
taking it will be able to answer some of the questions 
and thus make scores above zero, and hard enough 
that no pupils will be able to earn perfect scores. Un- 
less this principle is followed, the abilities of the best 
and the worst in the class are not adequately meas- 
ured, as their limits are not reached. An exception 
should be made to this principle in the case of tests 
covering minimum essentials, specially assigned lists 
of items, or what may be called “mastery” examina- 
tions. In such instances perfect or near-perfect achieve- 
ment may be expected of all members of the class. 
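
The principle just stated can be checked against a set of scores after an examination has been given. The sketch below counts zero and perfect scores and treats a "mastery" examination as exempt, as the text allows; the function name and the layout of the result are hypothetical.

```python
def range_check(scores, max_score, mastery=False):
    """Count zero and perfect scores and report whether the difficulty is suitable."""
    zeros = sum(1 for s in scores if s == 0)
    perfect = sum(1 for s in scores if s == max_score)
    # A mastery examination is expected to yield perfect or near-perfect scores.
    suitable = mastery or (zeros == 0 and perfect == 0)
    return {"zeros": zeros, "perfect": perfect, "suitable": suitable}
```

Any zero or perfect score on an ordinary examination indicates that the limit of that pupil's ability was not reached, so the difficulty of the questions should be adjusted before the examination is used again.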

For some purposes it is essential that a good examination
yield norms of pupil achievement. These are not
needed when the only purpose is to measure or com- 
pare individuals tested, but are important when it is 
desired to compare their performance with that of 
other groups. In the latter case norms or standards 
should already be available. 

Pupils should not be permitted such choice of questions
that they can leave out all questions, or the only
one, on any important topic or phase. It is the custom
of many teachers to give pupils some choice of the examination
questions which they will answer. Ten out
of twelve, eight out of ten, or some other number may 
be selected and answered. In some cases such choice is 
absolutely free, in others certain questions must be 
answered by all pupils, and the selection is limited 
to the remaining questions. The writer believes that in
general such opportunities for selection by pupils
should be very limited if given at all. There should at
least be a list of questions covering the minimum es-
sentials, the more important principles, or some other
body of vital content, which should be answered by all
pupils. It is not a serious weakness if some selection 
among unimportant details is allowed, but on the whole 
it does not seem best that pupils be allowed any selec- 
tion at all among examination questions intended to 
cover material which should have been studied by all 
the pupils in the class. The only argument for this 
practice which seems to possess any validity is that it 
tends to produce a better feeling on the part of pupils 
and to make them less afraid of examinations. 
Whether or not this is true is almost wholly a matter 
of what pupils are accustomed to, so if they have not 
been used to having a choice of questions there cer- 
tainly seems to be no good reason for introducing it. 
It is said that permitting choice makes the examination 
results fairer, but it does not appear that this argu- 
ment is borne out by the facts. If teachers know that 
pupils have had some choice in selecting the questions 
to be answered they will consciously or unconsciously 
mark them according to more rigid standards and thus 
in the long run whether or not selection is permitted 
does not affect marks. 

It is probably needless to say that the foregoing dis- 
cussion applies entirely to examinations of considerable 
length, perhaps an hour or more. In short quizzes only 
a few minutes in duration teachers should even more 
rarely if ever permit a choice on the part of the pu- 
pils. Furthermore, it should be mentioned that there 
are of course occasions on which it is desirable to test 
material not studied by the whole class in common,
such as different books read for outside reading in
English, and that choice of questions or even individ- 
ual ones may very properly be employed in such in- 
stances. 

An examination should never be used as a penalty or 
punishment, but should be considered as much a part 
of regular school routine as the recitation or the study 
period. Therefore, the writer is very strongly of the 
opinion that a system of exemption from examinations 
for those who earn high class averages should not 
exist. It is, unfortunately, a rather common practice 
for schools to have such systems. The usual provision 
of this sort is that pupils making average class marks 
above a certain point are excused from taking final 
examinations. Sometimes exemption is determined 
separately in each subject, sometimes by the general 
average so that pupils are either exempt in all or in 
none of their subjects. In many, perhaps most, cases 
additional requirements to the effect that unexcused 
absences and numbers of times tardy must not exceed 
a certain limit, are made. Also general deportment 
is often a factor. Numerous persons have argued 
strongly for such a system and many experienced 
school administrators and teachers have expressed 
strong opinions that exemptions bring desirable re- 
sults. One of the best of these arguments is that of 
Morley (62), who advances the following five points
in support of exemptions from examinations. (1) Ex- 
emptions do not sacrifice any existing advantages of 
examinations. (2) Superior pupils do just as good 
work with an exemption system as without it. (3) Av- 
erage pupils do better work in the hope that they will 
earn exemptions. (4) Teachers are relieved of much 
marking of papers. (5) The general moral tone of the
school is improved. In addition he states that in the
Cleveland Heights High School, of which he is princi- 
pal, allowing exemptions has caused better daily prep- 
aration, more voluntary research and reading of sup- 
plementary references, less tardiness in preparing 
written assignments and notebooks, and more volun-
teering in daily recitations.

It is admitted without argument that there are cer- 
tain desirable results produced by exemption from 
examinations, but not that they are all that Morley 
and others claim. Furthermore, these desirable results 
appear to be much more than balanced by the undesir- 
able ones. If the time which the teacher saves in mark- 
ing papers is devoted to some other profitable use this 
is an advantage of exemptions. It is also frequently 
true that pupils who make grades near, either above 
or below, the standard set for exemption are stimu- 
lated to better work during the term or semester in 
the hope of earning exemptions. A third frequent fa- 
vorable result is that the inclusion in exemption re- 
quirements of provisions concerning absences, tardi- 
ness and general deportment is frequently a powerful 
weapon in the hands of principals and teachers to 
secure better conditions along these lines. However, 
most, if not all, of these good results can be secured by 
other methods which do not involve the disadvantages 
and dangers of exemptions. An exemption system is 
not at all sure to improve the general morale of the 
school. In fact it frequently has just the opposite ef- 
fect, inasmuch as it often comes to be looked upon by 
pupils as more or less of a disgrace to have to take 
examinations. This creates a very undesirable attitude 
on the part of most of those who have to do so, and 
sometimes also on the part of those who are exempt.
It is doubtful if superior students usually do quite as
good work without examinations as with them. 

The whole exemption system tends too much to make 
the examination a penalty and a disciplinary device 
rather than an integral and educative part of the in- 
structional process. As has been stated above, one of 
the most important functions of examinations is the 
providing of opportunities for learning and this func- 
tion is of course entirely unfulfilled if examinations 
are not taken. Furthermore, another function, that of 
motivating pupil study and mental activity, may be 
accomplished by examinations for all those who take 
them, whereas exemptions only motivate those who are 
not sure of reaching the required standards, but still 
have some chance of doing so. The other four func- 
tions, which have to do with the measurement of pupil 
ability and achievement, the measurement of teaching 
efficiency, pupil diagnosis, and setting goals of attain- 
ment can be accomplished without examinations, but 
not as well as with them. It is true that if enough tests 
have been given during the course, the final examina- 
tion alone is not vitally important for these purposes 
nor, for that matter, for the other two just referred to. 
On the other hand, however, since it is usually the 
longest and most carefully prepared of all those given 
it should have more value in accomplishing each of the 
six functions than any one or even several shorter 
tests. 

It has sometimes been suggested that examinations 
be required of all pupils in what are often called 
“book” subjects as contrasted with manual training,
sewing, cooking, drawing and other subjects in which 
the work done by the pupils consists primarily in con-
structing or producing concrete objects of various
sorts. It is very probable that the omission of examina-
tions in the “book” subjects results in the loss of more
advantages than in the case of the so-called “practical 
arts” ones, but even in the case of the latter exemp- 
tions sacrifice some possible benefits. In the first place, 
each of these subjects has a sufficient body of underly- 
ing items of information and principles that knowl- 
edge thereof can well be tested, opportunity for fur- 
ther learning given, and so forth, by means of written 
examinations. Secondly, it is probably desirable that 
some of the examinations given in such subjects con- 
sist, not of written work, but of actual performance, 
more or less similar to that done in class from day to 
day. Thus at least part of the examination in a public 
speaking course may very well consist of the delivery 
of an actual speech or reading, in a cooking course of 
the actual preparation of some article of food, and so 
on. Probably in most cases there should be some ex- 
aminations of each type, that is both written and ac- 
tual performance. 

Examinations should be constructed and adminis-
tered so as to discourage bluffing or guessing and to
encourage steady and regular study throughout the 
course as the best preparation therefor. Likewise, they 
should discourage mere cramming as distinguished 
from review. 

Examination questions should be definitely and 
clearly stated so that all pupils will understand them 
alike. The accompanying directions also should be 
brief and clear so that pupils will understand just 
what they are to do: the order of answering the ques-
tions, the arrangement and form of answers required,
the manner in which they should distribute their time, 
and so forth. 




A single examination question should not include a 
number of diverse elements, although it is often de- 
sirable to list together in one exercise a number of sim- 
ilar items, such as terms to be defined, words to be 
translated, and so forth, even though the particular 
items are not closely or at all related to one another. 

If an examination is to cover a whole course or a 
considerable portion thereof it should include some 
exercises which test knowledge of the chief principles, 
facts, and other points in the material to be tested 
upon, and others which make an adequate and compre- 
hensive sampling of the minor details. Furthermore, 
not merely facts, but their relationship, application, in- 
terpretation and so forth should be dealt with. This is 
desirable both for purposes of securing accurate and 
significant measures and because of the effect upon
those being tested. 

An examination covering a large unit of subject- 
matter, or a series of shorter tests, should include ex-
ercises of a number of types so as to approach the sub-
ject from various phases, measure various abilities,
and thus yield a more comprehensive total measure. A 
discussion or essay examination should not consist en- 
tirely of cause and effect, analysis, summary, or any 
other single type of question, nor should a series of 
several short new-type tests be limited to any one or 
two varieties thereof. A semester’s testing program 
should include both chief types, and a number of vari- 
eties of each. Not only is variety in the forms used 
desirable, but also in the times at which tests are given 
and the intervals between them. Such variety should of 
course be introduced in accordance with a considered 
and unified plan and should not be merely hit or miss 
or accidental. Among the chief reasons why variety is
desirable are that it helps to keep up interest and
otherwise stimulate work and that it provides more 
satisfactory all-around measures than if there is too 
great uniformity. The detailed application of the prin- 
ciple of variety will of course vary in different sub- 
jects and with pupils of different degrees of maturity, 
but in all subjects and with all classes of pupils the 
general principle should be followed. There is no sub- 
ject which does not contain subject-matter of various 
types and kinds and each of these should receive at- 
tention in the testing program. Furthermore, in any 
group of pupils some respond best to one variety of 
test, some to another. Some tend to keep their work 
up to date, but to neglect review and what has been 
covered in the past, whereas others tend to do just the 
opposite. Some overemphasize speed and some quality 
or accuracy, some tend to make especial preparation 
for an announced test and others do not. Because of all 
these and other similar causes, the use of a single type 
of examination given under uniform conditions tends 
to yield a partial and unsatisfactory measure of pu- 
pils’ abilities and achievements. On the other hand, the 
introduction of variety gives both better measures and 
better emphasis upon different points or phases which 
need attention. 

It has been advocated by some persons that, con- 
trary to what has been said above, all tests be given at 
regular intervals and therefore of course always 
known in advance. There are certainly not only no 
valid objections to the giving of some preannounced 
tests and perhaps some of these at regular intervals, 
but rather good arguments for the practice. However, 
such a practice should not govern all tests. Regrettable
though the fact may be, the giving of frequent unan-
nounced quizzes is one of the most powerful instru-
ments in the hands of teachers for stimulating pupils
to keep their work up to date and properly reviewed.
Some of those advocating tests at regular intervals say 
that these intervals should be very short, even as short 
as a single day. Though there are a number of argu- 
ments in favor of having a short written test at every 
ordinary recitation, it does not appear to be best to do 
so. The chief reason is that the time necessary for such 
frequent tests can be more profitably used otherwise, 
and that after a certain frequency of tests, perhaps 
one every two or three days, is passed, the law of 
diminishing returns applies sufficiently that an in- 
crease in the number of tests is not justified. 

A principle closely connected with the one just dis- 
cussed and in a sense part of it, but worth separate 
mention, is that tests should be relatively short and be 
given rather often instead of being relatively long and
infrequently given. In general, three ten-minute tests
at intervals of several days apart are better than only
one thirty-minute test during the same length of time,
and two twenty-five-minute tests are better than one
fifty-minute test. For the ordinary junior or senior
high-school class lasting one semester there should not
be more than two or perhaps three long tests, each of
which consumes a whole class period, besides the final 
examination. There should be a few tests which con- 
sume perhaps half of the class period and a larger 
number lasting only a few minutes apiece. In the ele- 
mentary grades the same distribution of tests of vari- 
ous lengths should be followed, the absolute lengths 
being shorter. The chief advantages of having a rather 
large number of short tests instead of only a few long
tests are that the short tests stimulate study and keep
up interest better, concentrate attention on a relatively
small number of items or portion of subject-matter, 
and, by being given and taken under more different 
conditions, yield better average measures of pupil 
achievement. 

The question may be asked as to why it is recom- 
mended that there be even a few rather long examina- 
tions, or, in other words, why the principle stated in 
the preceding paragraph should not be applied more 
completely and generally. The answer is that there are 
certain values which are either inherent in examina- 
tions of more than a few minutes in length, or else 
much more readily obtained from them than from 
shorter ones. One of these values is the engendering of 
the ability to work continuously under pressure for a 
longer period than a few minutes. A second is the cov- 
ering of a considerable portion of subject-matter at 
one time, so that it is all unified in the pupil’s mind. 
Third, it is sometimes desirable to test the pupil’s 
knowledge of a rather large unit of subject-matter
through the use of topical and other types of questions 
which cannot be answered satisfactorily in a few min- 
utes. 

The construction of examination exercises should be 
such as to reduce the possibility of guessing and to 
discourage attempts of students to guess the right 
answers if they do not know them. The instructions 
accompanying the new-type tests, such as the alterna- 
tive, multiple-choice and others in which a pupil has 
one chance out of two or more of guessing correctly, 
should be such as to discourage attempts to answer 
those items concerning which the pupil is not reason- 
ably sure. The habit of guessing on an insufficient basis 
is not a desirable product of instruction and should be
discouraged. It will be shown in more detail later that
the methods of scoring such tests should take account
of the possibility of guessing a certain number of cor- 
rect answers. Pupils should be informed in advance 
that this is true, so that they will realize that in the 
long run guessing is not likely to increase their scores 
and may even lower them. 
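
The scoring methods themselves are left to a later chapter and are not given here. As an illustration only, the sketch below assumes the familiar correction in which the number of wrong answers, divided by one less than the number of choices offered, is subtracted from the number right. The function name and the figures are hypothetical, and the formula should not be read as necessarily the one the writer recommends.

```python
def corrected_score(right, wrong, choices):
    """Rights minus wrongs / (choices - 1); omitted items cost nothing.

    This is an assumed, commonly used correction for guessing,
    not a formula quoted from the text.
    """
    if choices < 2:
        raise ValueError("an item needs at least two possible answers")
    return right - wrong / (choices - 1)


# Alternative (true-false) test: 40 right, 10 wrong.
print(corrected_score(40, 10, choices=2))   # 30.0
# Four-choice test: 40 right, 12 wrong.
print(corrected_score(40, 12, choices=4))   # 36.0
# Blind guessing on 50 true-false items tends toward 25 right, 25 wrong.
print(corrected_score(25, 25, choices=2))   # 0.0
```

Under such a rule a pupil who guesses blindly gains nothing on the average, which is the point pupils should understand in advance.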

3. Summary. Various writers naturally state differ- 
ently the purposes which examinations should serve. It 
seems to the present writer that their functions may 
best be stated under six heads: the measurement of
pupil ability and accomplishment; pupil diagnosis; the
measurement and improvement of teaching efficiency;
the provision of opportunities for learning; motiva- 
tion; and the determination of standards or goals. 
Among the many qualities which good examinations 
should possess, three of the most important are objec-
tivity, reliability, and validity. They should also be
easy to give, score and take, contain few catch ques- 
tions, be neither too easy nor too hard, permit slight 
if any choice of questions, never be used as a penalty 
or required of only a portion of the pupils, contain 
definite and clearly-stated questions as well as direc- 
tions, cover the most important points and sample the 
details, consist of various types of exercises, be given 
at irregular but fairly frequent intervals, and be so 
constructed as to reduce guessing to a minimum. 



CHAPTER III

HOW TO MAKE AND GIVE EXAMINATIONS 


1. The preparation of good examinations. The reader
will recall that the last chapter began by stating six
functions of examinations. These were followed by a
list and discussion of the qualities which examinations
should possess in order to fulfill the stated functions.
The mere statements and explanations of some of
these qualities were sufficient to indicate how to secure
them, but in other cases this was not true. In this chap- 
ter, therefore, a number of things which teachers 
should do to produce examinations which possess cer- 
tain of the qualities already named will be suggested. 

In the first place, we may well consider how to se-
cure validity. There are two chief methods to be em-
ployed in doing so. These are the comparison of the
results secured on the test in question with those ob-
tained from other measures of the same thing and the
careful inspection and analysis of the test itself. The 
former method has been very largely used with stand- 
ardized tests, but there is no practicable way by which 
a teacher can employ it in advance to determine the 
validity of the examinations which she prepares and 
gives to her pupils. It requires a comparison of results, 
usually by means of correlation, with those from simi- 
lar tests, daily marks and other measures of the ability 
in question. A teacher can do this with test or examina- 
tion results after the tests have been given and, by
making a study of the results, learn something as to
how valid her tests are and probably thus be able to 
improve those which she makes in the future. Still 
further she can select those questions or items which 
appear to have the highest validity and preserve them 
for use on other occasions, and thus in the course of 
time accumulate a rather large number of valid exer- 
cises from which perhaps most of those included in 
future examinations can be selected. On the whole, 
however, the regular class-room teacher will probably 
make little use of this method. 
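
For the teacher who does wish to try the first method after a test has been given, the comparison "by means of correlation" may be sketched as below. The product-moment (Pearson) coefficient is assumed, since the text does not name a particular coefficient, and the marks shown are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of marks."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least two marks")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all marks are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical marks for five pupils: the examination in question
# and their average daily marks in the same subject.
examination = [55, 62, 70, 78, 90]
daily_marks = [60, 65, 72, 75, 88]
print(round(pearson_r(examination, daily_marks), 2))
```

A coefficient near 1 indicates only that the examination ranks pupils much as the other measure does; whether that other measure is itself valid remains a matter of judgment.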

The second method, that of critical analysis of con- 
tent and form, is one that should be used by every 
teacher in connection with every test or examination 
administered to pupils. It is, however, difficult to spec-
ify in detail just how this is to be done. The content
should be considered in the light of minimum essen- 
tials, curriculum aims, subject-matter covered, objec- 
tives, and so forth, and made to harmonize with them. 
The general form and the wording used, as well as 
what the pupils are asked to do, should be studied as 
carefully as possible in order to discover and eliminate
any disturbing factors which may have entered. Each 
question or exercise should be carefully considered by 
the teacher from the standpoint of just what it really 
measures and how it does so. The comparison of actual 
results with what the teacher expected them to be will 
often reveal faults which can be guarded against in the 
future. 

In practically all cases it is desirable that examina- 
tion questions or exercises be prepared some time in 
advance and ordinarily that more be prepared than 
will be needed. After they have been laid aside for a
few days or longer and have largely passed from mind,
a second consideration of them may reveal weaknesses
and suggest methods of improvement not apparent 
when they were constructed, and may also result in a 
better selection of those to compose the examination. 
It is still better if some other teacher or someone else 
qualified to do so can be prevailed upon to examine and 
criticize the list. Thus the judgments of two different 
persons may be compared and only the exercises used 
on which there is substantial agreement. 

In collecting and preparing questions or other exer- 
cises to be used in examinations, it is frequently a de- 
sirable practice to ask pupils to hand in lists of those 
which they consider suitable. Pupils should be given 
opportunity to prepare such lists with some care and 
usually in accordance with general directions given 
by the teacher. Furthermore, it should generally be 
understood that some of the questions actually asked 
upon examinations will be selected from among those 
handed in. In fact, in most cases almost all examina-
tion questions can be so derived, although modifica-
tions of form and expression are frequently necessary. 
It is not a matter of great importance that questions 
be so obtained, but certain worth-while results ap- 
pear to ensue from so doing. Probably the most valu- 
able of these is that the preparation of such questions 
serves as a stimulus to careful, thoughtful review by 
the pupils. In the second place, it is likely to encourage 
a more favorable attitude toward examinations on the 
part of the pupils and to make them regard tests less 
as arbitrarily imposed tasks than as profitable exer- 
cises. Moreover the study of questions prepared by 
members of the class is often beneficial to teachers, 
helping them to get a broader view of the subject 
taught. 




A teacher should accumulate suitable exercises for
use in examinations from time to time. These will in- 
clude exercises previously used, those employed by 
other teachers or on state, county and other officially 
prepared examinations, exercises which have come to 
the mind of the teacher while a particular topic was 
being covered, questions from members of the class 
and material from various other sources. These should 
be so kept and classified that the teacher can rather 
easily and in a comparatively short time assemble the 
material for an examination. In the case of those which 
she has used previously brief notes should be made as 
to how satisfactory they were, whether they were too 
hard or too easy, clearly worded or apparently misun- 
derstood by some members of the class, about how 
much time was required to answer them, and so forth. 
Thus, instead of each new test or examination being 
constructed merely for the occasion, the teacher can 
both capitalize her past experience and reduce the la- 
bor involved. 
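
A file of exercises of the kind described might be kept in many ways. The sketch below is one hypothetical arrangement: the record fields, the labels used for difficulty, and the function `assemble` are all assumptions made for illustration, not a scheme given in the text.

```python
from dataclasses import dataclass


@dataclass
class Exercise:
    text: str
    topic: str
    source: str                    # e.g. "previous test", "pupil", "state examination"
    difficulty: str = "untried"    # "too easy", "satisfactory", "too hard"
    clearly_worded: bool = True
    minutes_needed: float = 0.0    # about how long pupils took to answer
    notes: str = ""


def assemble(exercise_file, topic, time_limit):
    """Choose satisfactory, clearly worded exercises on a topic within a time limit."""
    chosen, used = [], 0.0
    for ex in exercise_file:
        if ex.topic != topic or not ex.clearly_worded:
            continue
        if ex.difficulty != "satisfactory":
            continue
        if used + ex.minutes_needed <= time_limit:
            chosen.append(ex)
            used += ex.minutes_needed
    return chosen


# Hypothetical entries.
exercise_file = [
    Exercise("Define 'norm'.", "measurement", "previous test", "satisfactory", True, 2),
    Exercise("Explain validity.", "measurement", "pupil", "too hard", True, 5),
    Exercise("Define 'reliability'.", "measurement", "state examination", "satisfactory", True, 3),
]
for ex in assemble(exercise_file, "measurement", time_limit=5):
    print(ex.text)
```

The notes on difficulty, wording, and time are what allow past experience to be used; a file holding the questions alone would save labor but little else.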

It is very often desirable to repeat, frequently in 
a different form, questions or exercises used on previ- 
ous tests. Ordinarily the ones so repeated should be 
those upon which the class did not do very well and to 
which further attention has been given by the teacher 
in the hope of remedying the situation. Occasionally, 
however, questions which, on the whole, were answered 
satisfactorily should be repeated to make sure that 
the material dealt with is being held in mind and not 
merely learned temporarily and then forgotten. Some- 
times, especially in a new-type examination consisting 
of a rather large number of exercises or items, it is 
desirable to follow the procedure just suggested within 
the same test, perhaps approaching the same fact or
principle from two viewpoints or even merely using
two differently worded statements which mean prac-
tically the same thing.

In considering an exercise to be used on an examina- 
tion, the teacher should consider the form, length and 
other details of the answers which pupils will probably 
give. Not infrequently so doing will prevent the use of 
questions which would be misinterpreted or otherwise 
handled in a manner different from that desired. 

There are at least two methods of constructing ex- 
aminations to insure that they measure the total range 
of abilities within a particular group of pupils. The
teacher may insert in each test one or more questions 
easy enough that even those individuals of least ability 
or achievement are able to give satisfactory answers 
and one or more difficult enough that the best in the 
class cannot answer them perfectly. The second 
method is to make the amount of time allowed such 
that every pupil is able to complete at least one ques- 
tion, but no pupil to finish all. 

In the course of a semester’s or year’s work a 
teacher should give a number of short tests of a few 
minutes in length and several longer ones consuming
from a half to a whole class period or perhaps occa- 
sionally even more. Some of these tests should be of 
various objective or new examination types, others of 
the traditional type; some, usually the longer ones, 
should be announced and some unannounced; some 
should emphasize accuracy or quality, some speed, and 
some neither one markedly more than the other. Some 
should be factual or informational, some should test 
reasoning ability, some power to apply what has been 
learned, and so forth. Some should be upon the work 
assigned for the days on which they are given, others
upon that immediately preceding, still others upon
work further back. Some should deal more or less in- 
tensively with one rather narrow topic or subject, 
some cover a much wider field in much less detail, 
some should do a portion of both. 

2. The administration of examinations. The intervals 
or times at which tests are given should be largely 
determined by the units or natural divisions of the 
subject-matter and not by arbitrary divisions of the 
course into portions, each of which covers about 
the same length of time and is followed by a test. Also 
the performance of the members of the class, their at- 
titude and what may be called their mental health, 
should play a part. Thus if it is evident from oral reci- 
tations or other sources that a particular bit of subject- 
matter has been well mastered, it may be touched upon 
lightly if at all in tests given soon after that time, 
whereas one which appears not to be so well mastered 
should be covered more thoroughly. Moreover, if it 
seems that the class is neglecting its work, that there 
is a tendency to be a day or two behind, or to assume 
that something once recited upon need not be retained, 
one or several short tests may help to correct the 
difficulty. 

The length of time devoted to tests and examina- 
tions should, as has already been said, differ both ac- 
cording to the maturity of the pupils being tested and 
from time to time in the same class. It is probably a 
satisfactory rule to say that except for final examina- 
tions no test should continue longer than the ordinary 
recitation period of a class and that each final ex- 
amination should be about twice the length of an ordi- 
nary recitation period. In other words the writer 
believes that an hour and a half is a desirable length
of time for a final examination in high school. This
time should be decreased by about ten minutes for each 
grade below the high school, thus becoming about 
eighty minutes for the eighth grade, seventy for the 
seventh, and so on. 
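
The rule just given amounts to a simple calculation, sketched below. It is assumed here that the high school begins with the ninth grade, which agrees with the eighth grade being the first step down; the function name is hypothetical.

```python
HIGH_SCHOOL_MINUTES = 90   # an hour and a half
FIRST_HIGH_SCHOOL_GRADE = 9  # assumption: grades 9 to 12 form the high school


def final_exam_minutes(grade):
    """Suggested length of a final examination for a given grade."""
    if grade < 1:
        raise ValueError("grade must be 1 or higher")
    if grade >= FIRST_HIGH_SCHOOL_GRADE:
        return HIGH_SCHOOL_MINUTES
    # Ten minutes less for each grade below the high school.
    return HIGH_SCHOOL_MINUTES - 10 * (FIRST_HIGH_SCHOOL_GRADE - grade)


for grade in (12, 9, 8, 7, 4):
    print(grade, final_exam_minutes(grade))
```

Carried literally to the lowest grades the rule gives very short periods, so the figures for those grades are best taken as rough guides only.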

In the timing of most tests given by the teacher it is 
not highly important that time be determined accu- 
rately to exact seconds or even the nearest five or ten 
seconds. For standardized tests which have definite 
time limits and for occasional short speed tests which 
the teacher may give such a high degree of accuracy is 
essential. In such cases, a stop watch is desirable, but 
it is not necessary, and it is not recommended that 
such watches be secured by teachers merely for use in 
giving regular class-room tests. A clock or watch with 
a second hand should be available, however, and em- 
ployed for timing purposes. Pupils should understand 
that they are expected to begin and stop promptly and 
should be trained so that they do so. For short speed 
tests it is desirable to follow the procedure recom- 
mended in the case of many standardized tests or some 
other very similar to it. After whatever preparations 
are necessary have been made each pupil should raise 
his right hand (or his left one if he writes with it) from 
the desk with his pen or pencil therein and look di- 
rectly at the teacher until the signal to begin is given, 
at which time he should start work at once. Similarly, 
when the teacher gives the signal to stop pupils should 
at once look up and raise their hands from their pa- 
pers. Such exact precision at beginning and quitting 
is not essential in connection with tests which are not 
primarily speed tests or which last for fairly long 
periods of time, but promptness should be insisted 
upon. Pupils should form the habit of going to work
at once. This does not mean that they should always
begin to write immediately, since frequently it is de- 
sirable to spend some time in looking over the ques- 
tions and planning what their answers will be before 
actually starting to write. Also, although they should 
not be expected to stop writing in the very middle of a 
word, they should not be allowed to continue for a 
minute or two after time is called. In general the pro-
vision that pupils shall not do more than complete the
sentence upon which they are writing when time is 
called is fairly satisfactory for traditional examina- 
tions. 

In tests which contain general discussion or essay 
questions warning should be given shortly before the 
time is up, how long before depending largely upon the 
total length of the test. On the average it is rather sat- 
isfactory to give warning when about 10 per cent of 
the total time still remains, that is, when two minutes 
are left on a twenty-minute test, perhaps five minutes 
on a forty-five minute one, ten minutes on an hour and 
a half examination, and so on. The chief reason for 
giving such warning is to allow pupils the opportunity 
to round out and complete their discussions as best 
they can in the time left rather than to leave them in 
incomplete form, and to afford the opportunity of put- 
ting down the most important part of what they have 
planned to write, if there is not time to record all of it. 
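
The three cases cited above can be checked with a little arithmetic. The following Python sketch is offered only as an illustration: the function name `warning_share` is hypothetical, and the figures are those given in the text. It shows that each suggested warning falls at roughly one tenth of the period.

```python
# (total minutes, warning minutes) pairs taken from the paragraph above.
EXAMPLES = [(20, 2), (45, 5), (90, 10)]


def warning_share(total_minutes, warning_minutes):
    """Return the fraction of the test period that remains at the warning."""
    return warning_minutes / total_minutes


for total, warning in EXAMPLES:
    share = warning_share(total, warning)
    print(f"{total}-minute test: warn with {warning} min left ({share:.0%})")
```

The rule is an approximate one, so a teacher would ordinarily round to a convenient whole number of minutes rather than compute the fraction exactly.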

Pupils should be directed and encouraged to form 
the habit of taking time to think over and organize 
the material for the answers to general discussion and 
perhaps other types of questions before writing out 
their responses in final form. On a test composed of 
short memory items there is no need of opportunity 
for so doing, but upon practically all others the an-
swers will be improved more than enough to pay for
the additional time required. Sometimes this prelimi- 
nary organization can be done entirely in the mind. At 
other times, notes, brief outlines, or other material 
may be jotted down to serve as the basis for the final 
answer. It is possible to go too far in this direction and 
to spend so much of the time allowed in planning what 
the answer will be that there is not sufficient time left 
in which to write it satisfactorily. With most ele- 
mentary and high-school pupils this danger is more 
apparent than real and the tendency is for them to 
write too hastily rather than too deliberately. No rule 
can be laid down as to how much of the total time al- 
lotted to an essay question should be used for this 
purpose but it may be suggested as a very general rule 
that frequently 20 or 25 per cent of the time may be 
profitably employed in such preliminary planning. 
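
The very general rule just suggested can be put in numbers. The sketch below assumes only the 20 and 25 per cent figures from the text; the function name `planning_window` and the forty-minute example are hypothetical.

```python
def planning_window(total_minutes):
    """Return (low, high) minutes for planning: 20 and 25 per cent of the total."""
    low = total_minutes * 20 / 100
    high = total_minutes * 25 / 100
    return low, high


low, high = planning_window(40)
print(f"Of 40 minutes, roughly {low:g} to {high:g} may go to planning.")
```

Since no fixed rule can be laid down, the result is best read as a rough guide and not as a requirement.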

If responses are called for in some form with which 
the pupils are not well acquainted, examples or illus- 
trations should usually be employed to make clear 
what is wanted. On the other hand, if a commonly used 
form with which the pupils are thoroughly familiar is 
employed, not only will examples be superfluous but 
practically no directions at all will be necessary. The 
pupils may merely be instructed to do as they have 
done before, or this can even be so well understood 
that they do not need to be directly told. In any event, 
they should understand clearly, either from previous 
practice or definite directions, such points as the order 
in which questions are to be answered, the arrange- 
ment of their answers, the approximate amount of 
time to be devoted to each, and so forth. It is generally 
best to instruct pupils to go through the exercises, 
first doing all that they are reasonably sure of and
leaving space, in the proper place, for each of the
others. Also a warning should be given against spend- 
ing too much time on a few questions with the result 
that the others must be slighted or even omitted. In 
some cases, however, because of the content and ar- 
rangement of a test, pupils should do each item or ex- 
ercise, if they can do it at all, before proceeding to the 
next one. 

Some examinations should be given during which 
pupils are allowed to make use of textbooks and other 
helpful material. This of course applies to questions 
which call for the application, interpretation, evalua- 
tion and so forth of facts and principles, and not for 
their mere repetition. In mathematical subjects such 
occasions are rather frequent, but in many others also 
this practice should be followed on some occasions. In 
other words, some examination exercises should re- 
quire the pupil to make the best use he can of material 
which he has memorized without other help, others of 
material found in certain available but limited sources, 
and still others of all available material. In this as in 
many other things examination practices should con- 
form to life experiences. Sometimes individuals must 
deal with a situation without any assistance at all, re- 
lying entirely upon the content of their minds. At other 
times a limited amount of reference material or other 
assistance is available and at others they have the op- 
portunity to make use of all, or practically all, by 
which they can profit. 

The questions used on some examinations, or at 
least a part of them, should be given to the class some 
time in advance so that the pupils may have opportu-
nity to prepare as definitely and specifically as pos-
sible to answer them. Thus their ability to make such
preparation can be measured. Since it is probable that
some readers will disagree with this suggestion, what 
is perhaps the best argument for this practice is sum- 
marized below. It is by Gathany (29) who, however, 
does not maintain that all questions should be revealed 
in advance. He discusses history examinations particu- 
larly, but practically all that he says applies to other 
subjects as well. He states that as examinations are 
usually given they have at least six bad results or 
points of weakness, as follows: 

(1) They encourage much waste of time and the spirit 
of taking a chance, in that pupils devote considerable time to 
trying to guess what will be asked.

(2) They disappoint and discourage pupils who have made 
serious preparation for them, but just happened not to be 
prepared upon some of the detailed points asked.

(3) They are not fair to pupils or teachers because they 
yield only inadequate samples of pupils’ ability and achieve- 
ment. 

(4) They do not offer an incentive to thorough preparation 
because what will be asked is too much a matter of chance. 

(5) They do not take into account the fact that high-school 
pupils have little sense of proportionate value and that in pre- 
paring for examinations they will often devote considerable 
time to memorizing or studying rather unimportant details, 
but little time to much more important matters. 

(6) They cause pupils to dislike and dread examinations
very much. 

Gathany states that telling pupils at least some of 
the examination questions a week or more in advance, 
or perhaps even before the topic which the examina- 
tion is to cover has been studied, will do away with 
the six adverse criticisms which he lists and in addi- 
tion will yield three other advantages. These are the 
abolition of the necessity for several review lessons
before each examination, the encouragement or even
almost forcing of the use of textbooks for reference
purposes, and the absorbing of more real historical 
knowledge by pupils. The writer is not inclined to go 
as far as Gathany and a few others in this matter, and 
does not believe that all the advantages claimed for 
announcing examination questions in advance neces- 
sarily follow. He does believe, however, that it is good 
practice to reveal some examination questions to the 
class long enough in advance of the time of the ex- 
amination to enable pupils to make specific prepara- 
tion on the topics dealt with. Some of these may well 
be given in the exact form in which they will be asked, 
others by announcing that certain rather definite 
topics will be covered in the examination. Those which 
are given in exact form may in some instances consist 
of rather short lists of important facts which are to 
be memorized, such as the names of the oceans and 
continents in a lower-grade geography class, of the 
presidents in a history class, of a certain table of 
measures in arithmetic, or of a list of symbols in chem- 
istry. More often, however, the questions given in ad- 
vance should be those requiring discussion and calling 
for answers which are not absolutely right or wrong, 
but require mastering and organizing a certain body 
of material. Examples of such questions are: “State 
as best you can in 200 words the part played by Presi- 
dent Wilson in connection with the Treaty of Peace 
and the League of Nations” and “State as many defi- 
nite reasons as possible why the Elizabethan Age pro- 
duced a large number of writers.” 

In the giving of all tests in the lower and intermedi-
ate elementary grades the directions, whether con-
tained on the test papers or not, should be read aloud
by the teacher. Indeed, this should also frequently be
done in the upper elementary and high-school grades. 
If the directions are also contained upon the papers, 
pupils should be instructed to read them silently as
the teacher reads them aloud. 

If an examination consists of very many questions 
or items, a copy should be placed in the hands of each 
pupil. This helps greatly in insuring that the questions 
are understood. When it is impracticable or unneces- 
sary to do so, the questions should be clearly and 
plainly written on the board and also read to the class. 
A single brief question, or perhaps two or three with 
older pupils, may be given only orally. 

In the previous chapter it was stated that little if 
any choice of questions should be allowed pupils. If 
any freedom of selection is allowed, what is probably 
the best method of doing so is to allow choice only 
within questions, each of which contains a number of 
rather similar parts dealing with the same kind of 
material. For example, in foreign language a pupil
may be allowed to select six out of eight verbs of which 
he is to give the principal parts, two out of three 
passages to translate, and so on. In history, he may be 
allowed to select three of the presidents since 1900 as 
those whose administrations he is to summarize, or in 
literature to choose one of two books which he will dis- 
cuss. Thus, no pupil can, if he answers as many ques- 
tions or parts of questions as he is supposed to, omit 
any of the large general topics covered by the exami- 
nation. 

Especially in the scoring of new-type tests, but some- 
times in that of traditional examinations, the question 
commonly arises as to how to score when pupils have 
not followed directions. In cases in which the intent of
the pupil is clear, the writer believes that he should be
given credit for his intent, and not penalized because 
of failure to follow directions exactly. For example, in 
a true-false test in which answers are to be written in 
front of the sentences, they may be placed after them. 
In a multiple-answer test in which answers are to be 
underlined, they may be encircled or otherwise indi- 
cated. If, however, after being warned, a pupil per- 
sistently continues to record his answers in different 
ways than directed, it is perhaps not unjustifiable to 
penalize him more or less severely. In other cases, his 
failure to follow directions should be scored the same 
as if he had given no answer at all. If, for example, one 
answer in each set is to be indicated in a multiple- 
answer test, and two or more are marked, or if only 
one word is to be written in each blank in a completion 
test and two or more are inserted, the correct pro- 
cedure is not to give the pupil credit even though one 
of his answers is correct. In the case of a discussion 
question in which it is stated that pupils are to write a 
certain definite amount, such as one page, probably the 
best method of scoring is to neglect whatever has been 
written in excess of this amount and to give the pupil 
what he deserves upon the first page of his answer. 
Certainly he should not be denied any credit at all 
merely because he wrote too much nor would it be fair 
to other pupils who kept within the specified limit to 
mark him upon the content of all he wrote. 
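
The rule for multiple-answer items stated above may be sketched as follows. This is a minimal illustration and not a prescribed scoring scheme: the function name `score_multiple_answer_item` and the allowance of one point per item are assumptions.

```python
def score_multiple_answer_item(marked, correct):
    """Score one item: credit only if exactly one option is marked and it is right.

    `marked` holds the options the pupil indicated in any way (underlined,
    encircled, and so on); how they were indicated is ignored, since the
    pupil is credited for clear intent.
    """
    marked = set(marked)
    if len(marked) != 1:
        return 0  # nothing marked, or two or more marked: no credit
    return 1 if correct in marked else 0


print(score_multiple_answer_item(["b"], "b"))       # one answer, correct
print(score_multiple_answer_item(["b", "c"], "b"))  # two answers marked
```

The same test, that only one response is given where only one is asked for, applies to a completion blank in which a single word is to be written.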

In connection with ordinary tests and examinations, 
the corrected papers should be returned to the pupils 
as soon as possible. The reason for so doing is that, if a 
number of days elapse before papers are given back, 
pupils are liable to forget many things concerning the
examination and why they gave the responses they did,
and therefore lose much of their interest in the mat-
ter. If, however, the corrected papers are returned
within a day or two and the results clearly and fully 
discussed by both pupils and teacher, many of the er- 
roneous conceptions indicated can be corrected and 
many of the items not known can be taught. In connec- 
tion with practically all tests and examinations ex- 
cept those given at the end of the semester or year, at 
which times it is commonly impracticable to return 
them, this should be one of the most important phases. 

At this same time that the papers are returned with 
the pupils’ marks upon them, the teacher should give 
information which will enable each pupil to get a fairly 
accurate idea of his standing in comparison with the 
rest of the class. There are various ways of accom- 
plishing this. Sometimes it is sufficient to announce the 
highest score made, the average or mid-score, and the 
lowest score. It is, however, perhaps better to place 
on the board the whole distribution of scores showing 
how many pupils received A’s, B’s, C’s, and so forth, 
if these symbols are employed, or, if a percentile or 
other point system is used, how many received between 
90 and 100, how many between 80 and 90, and so on. 
Sometimes this can well be done graphically by the 
construction of a figure which shows the distribution 
of the scores of the whole class. Indeed, if pupils are 
taught to understand such graphical representations, 
this method is perhaps the most vivid of any. 
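
The ways of reporting class standing described above can be illustrated with a short Python sketch. The scores are hypothetical, the function names are assumptions, and the mid-score is taken here to be the median. The rows of marks printed at the end are a crude form of the graphical figure mentioned in the text.

```python
from collections import Counter
from statistics import median


def class_summary(scores):
    """Return the highest score, the mid-score (median), and the lowest score."""
    return max(scores), median(scores), min(scores)


def band_counts(scores, width=10, top=100):
    """Count scores in bands such as 80-90 and 90-100.

    A score on a boundary is placed in the upper band (80 counts as 80-90),
    and a perfect score joins the highest band; both choices are assumptions.
    """
    counts = Counter()
    for score in scores:
        lower = min(int(score // width) * width, top - width)
        counts[(lower, lower + width)] += 1
    return dict(sorted(counts.items(), reverse=True))


scores = [95, 88, 100, 72, 80, 67, 91, 85]  # hypothetical class scores
print(class_summary(scores))
for (low, high), n in band_counts(scores).items():
    print(f"{low}-{high}: {'#' * n} ({n})")
```

Only the distribution is shown; no pupil's name is attached to any score.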

In whatever way the scores of the class are an- 
nounced or revealed, those of individual pupils should 
ordinarily not be made known. Sometimes it is desira- 
ble to announce the one or several of the pupils making 
the highest scores or perhaps all those above a certain 
point. In general, however, a pupil’s score or mark
should be regarded as the property of the pupil and
the teacher, and therefore as something not to be re- 
vealed to the other pupils unless the one receiving it is 
willing to do so. This should not be interpreted in an 
extreme fashion to justify an air of great secrecy and 
the taking of elaborate precautions to insure that no 
pupil finds out the mark of another pupil unless the 
latter is entirely willing. Ordinarily, however, no good 
purpose appears to be served by revealing a pupil’s 
score to the rest of the class. 

In connection with the giving of examinations and 
tests the problem of honesty and the prevention of 
cheating on the part of pupils is frequently somewhat 
prominent. The question as to whether or not there 
should be an honor system of some sort in connection 
with examinations has been largely debated in colleges 
and universities. Moreover, such a system has been 
tried frequently and is now in use in a number of insti- 
tutions. It has received less attention in the high school 
and still less in the elementary school. Largely because 
of this fact, that it is not the prevalent practice, and 
because the weight of opinion is against its use, it is 
not among the purposes of this volume to enter into an 
extended discussion of the honor system. 

A device which is probably used more often than the 
regular honor system and which is in one sense a modi- 
fication of it although in another sense it is not, is the 
requirement that at the close of an examination pupils 
sign a pledge even though the teacher has been present 
during the whole examination period. This pledge is 
to the effect that they have not received any help on 
the test and perhaps also that they have not given any. 
If such a pledge were used with the teacher absent, it 
would of course be considered an honor system, but
with the teacher present it is not usually so called. The
use of a pledge of the sort indicated as well as that of
the regular honor system is debatable, but here also 
the bulk of present practice and belief is opposed to it. 
It is universally held that it is desirable to try to de- 
velop an attitude of honor which is opposed to cheat- 
ing, but not that it is desirable to require pupils to sign 
a pledge that they have not received or given help. 
Some studies, especially one by Doyle and Foote (20), 
based upon pupils’ statements and other data seem 
to indicate that the signing of a pledge only moder- 
ately reduces cheating, whereas it is responsible for 
adding falsehood to cheating in the cases of many of 
the pupils. It is therefore recommended that neither 
a complete honor system nor a pledge of honest work 
be employed in elementary or high school, but that 
teachers attempt to develop an attitude of honesty so 
as to reduce the spirit of cheating and also to prepare 
and administer examinations of such a nature and in 
such a way as to lessen the amount by which pupils can 
profit by cheating and therefore the strength of the 
temptation to cheat. 

This naturally brings us to the question as to what 
can be done to make undetected cheating more difficult 
and to prevent pupils from profiting largely by it if 
it is successful. Practically any experienced teacher is 
well aware of a number of things to do which will ac- 
complish the first aim fairly well, but how to accom- 
plish the second is either generally less known or if 
known not carried out. To return to the former, prob-
ably the chief means of preventing cheating, apart
from the development of an honest attitude, is to ar- 
range the pupils so that it is not easy for them to see 
one another’s papers and to watch them carefully
while they are at work on the examination. In a class
of ordinary size it should not be overly difficult for a 
wide-awake teacher to watch each of her pupils suffi- 
ciently to be reasonably sure that no considerable 
amount of cheating takes place. A pupil may now and 
then steal a hasty undetected glance at someone else’s 
paper, or even at material of some sort which he has
prepared and secreted about his desk or person. The 
chances are unfavorable, however, that he will be able 
to do so often enough to make any material difference 
in the quality of his answers without being detected by 
an alert teacher. By observing the faces and general 
actions of the pupils, a discerning teacher can usually 
distinguish those who are looking for opportunities to
cheat from those who are working honestly. The for- 
mer will frequently be watching her or they may 
have their gaze fixed intently upon another pupil’s pa- 
per or something else in a way that indicates they are 
not merely looking away from their own paper while 
thinking what to write. With a small class, a teacher 
can often remain seated most of the time in such a 
position as to watch satisfactorily, but with one of the 
ordinary size she should move around the room more 
or less and perhaps remain standing during most of 
the examination period. 

One of the most effective ways of preventing any 
sort of cheating except looking on another pupil’s pa- 
per or in some other way securing help from a fellow 
pupil is to employ questions of such a sort that a hasty 
glance at the textbook, previously prepared notes, or 
other material will afford little or no help. In other 
words, questions which call for reorganization of ma-
terial previously studied, for new applications, for
original illustrations and examples, and so on, render
it practically impossible for the pupil to cheat by the 
previous preparation of material which will be of any 
considerable advantage to him. Of course, as has been 
mentioned elsewhere, all tests and examination ques- 
tions should not be of this sort. Some should deal with 
mere memory items and facts and it is upon these that 
pupils can improve the quality of their answers by 
hasty glances at the textbook and other sources of in- 
formation. 

Various writers have suggested more or less elab- 
orate plans of giving examinations, usually of the ob- 
jective type, so as to eliminate cheating or be able to 
detect it if it occurs. Every teacher is familiar with 
such general methods as the preparation of two or 
even more sets of questions and the distribution of 
them to the pupils so that in no case do pupils seated 
side by side or in some other advantageous position to 
look at one another’s papers, answer the same set. 
This, however, doubles or more than doubles the labor 
of preparing questions and, except in unusual cases, 
is not recommended for use. An easier method posses- 
sing many of the same advantages may be used when 
the examination exercises cover more than one sheet. 
This is to use the same questions for all pupils, but to 
have the sheets arranged in different orders and to re- 
quire pupils to answer the items in the order in which 
they occur. For example, if there are three sheets of 
examination exercises, every third pupil would have 
his arranged in 1, 2, 3 order; a second third would be-
gin with sheet 2 and follow it by 3 and 1; the other
third would have sheet 3 first, then 1 and 2. Thus dur-
ing a large portion of the examination period adjacent
pupils would not be working upon the same sheet at
the same time and so could derive little benefit from 
glancing at another’s work. 
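
The rotation of sheets described above may be sketched in Python as follows. The function name `sheet_order` and the numbering of seats from 0 along a row are assumptions made for the illustration; the sheets are numbered from 1 as in the text.

```python
def sheet_order(seat_number, sheet_count=3):
    """Return the order of sheets for a pupil, rotated by seat position.

    Seats are numbered 0, 1, 2, ... along a row (an assumption); sheets are
    numbered from 1 as in the text.
    """
    start = seat_number % sheet_count
    return [(start + offset) % sheet_count + 1 for offset in range(sheet_count)]


for seat in range(6):
    print(seat, sheet_order(seat))
```

With three sheets, pupils in adjacent seats never hold the same sheet in the same position of their order, which is the property the plan depends upon.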

One means of detecting and thereby preventing 
cheating which should not be overlooked is that of 
comparing pupils’ answers in cases in which it is sus- 
pected cheating has occurred. One cannot always be 
absolutely sure of the facts even after a careful com- 
parison is made, but in many cases the evidence will 
be strong enough one way or the other to be fairly con- 
clusive. The pupil who cheats is rarely shrewd enough 
to modify the material copied from someone else’s pa- 
per sufficiently that it cannot be recognized as copied. 
Mistakes are usually copied along with correct an- 
swers. As was suggested above probably the easiest 
exercises from the standpoint of pupils copying from 
their neighbors are those which call for single items of 
information, including a number of the new examina- 
tion types, such as true-false, matching, completion, 
and so on. A very rapid glance is sufficient for a pupil 
to see what his neighbor has answered to one or even 
several exercises. If only a few answers are copied or 
if those which are copied are not all taken from the 
same pupil the detection of cheating in such cases is
fairly difficult. A teacher can, however, by simple sta- 
tistical procedure determine to what extent the mis- 
takes of pupils chosen at random are alike and can be 
reasonably sure that in the case of any two neighbor-
ing pupils whose papers are much more alike than
this, at least one of them copied from the other. All 
that is necessary is to compare a number of pairs of 
papers chosen at random, except that no pair is so 
selected that one of the pupils could have copied from 
the other, and then to count the number of identical
errors on each pair. These numbers are added and
divided by the number of pairs to give their average. 
If a teacher does this and finds that the average number 
of identical errors is three, for example, and in no case
are there more than five, she can be reasonably sure 
that if a pair of pupils occupying adjacent seats have 
six or eight or ten identical errors, one of them must 
have copied from the other. In such cases, except at the 
very beginning of a course, a teacher usually knows 
enough about the pupils to be fairly sure which pupil 
did the copying, even if her suspicions were not 
aroused by anything observed during the examina- 
tion. 
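
The comparison of identical errors described above can be sketched in Python. This is only an illustration under stated assumptions: the pupils' names and answers are hypothetical, the function names are invented for the example, and the decision rule (a neighboring pair is suspected when it exceeds the largest count found among the random pairs) is one reading of the procedure in the text.

```python
import random
from itertools import combinations


def identical_errors(paper_a, paper_b, key):
    """Count items on which both pupils gave the same wrong answer."""
    return sum(
        1
        for a, b, right in zip(paper_a, paper_b, key)
        if a == b and a != right
    )


def baseline(papers, key, could_copy, sample_size, seed=0):
    """Return (average, largest) identical-error counts over random pairs.

    Pairs listed in `could_copy` (seated so that copying was possible) are
    left out, as the text requires.
    """
    eligible = [
        pair
        for pair in combinations(sorted(papers), 2)
        if frozenset(pair) not in could_copy
    ]
    if not eligible:
        raise ValueError("no pairs remain for the baseline")
    chosen = random.Random(seed).sample(eligible, min(sample_size, len(eligible)))
    counts = [identical_errors(papers[a], papers[b], key) for a, b in chosen]
    return sum(counts) / len(counts), max(counts)


# Hypothetical ten-item true-false papers; Dee and Eve sit side by side.
key = list("TFTFTFTFTF")
papers = {
    "Ann": list("TFTFTFTFTF"),
    "Ben": list("FFTFTFTFTF"),
    "Cal": list("TTTFTFTFTF"),
    "Dee": list("FTFTTFTFTF"),
    "Eve": list("FTFTTFTFTF"),
}
could_copy = {frozenset(("Dee", "Eve"))}

average, largest = baseline(papers, key, could_copy, sample_size=9)
neighbors = identical_errors(papers["Dee"], papers["Eve"], key)
print(f"baseline average {average:.2f}, largest {largest}; Dee and Eve share {neighbors}")
if neighbors > largest:
    print("Dee and Eve agree on far more errors than chance pairs do.")
```

Such a count shows only that two papers are unusually alike; which pupil copied must still be judged from what the teacher knows of the pupils.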

Finally, there is at least one more thing which can be 
done to reduce the likelihood of cheating upon exami- 
nations. This is to present examinations as integral 
parts of the school procedure and as having other pur- 
poses than merely to assist in the determination of 
marks. As a result, pupils do not feel that tests are 
something to be dreaded or something in the nature of 
a contest between themselves and the teacher in which 
they are free to make use of any means whatsoever of 
getting the better of the teacher. Ordinarily, many pu- 
pils who do not attempt to cheat during the regular 
recitations of a course will attempt to do so upon tests 
and especially upon final examinations. If, however, 
they are brought to have the same attitude toward ex- 
aminations as toward daily recitations they will tend 
not to cheat upon the former. One element in this situ- 
ation is that they frequently believe, often justifiably 
though perhaps sometimes not so, that their passing 
or some other important question is almost entirely 
dependent upon examination marks alone. If they real-
ize that, as should be the case, their work during the
term or semester plays a large part in determining
their final marks and that the final examination alone
does not receive undue weight they will feel much less
temptation to cheat. 

3. Summary. Tests may be made more valid by
statistical studies and by critical analyses of content 
and form. Examination questions should be collected 
and accumulated and used in the construction of spe- 
cific tests. A complete testing program should include 
a number of short tests and a few longer ones given at 
more or less irregular intervals, and embodying exer- 
cises of at least several different types. Timing should 
be exact for short tests and fairly so for long ones, and 
pupils should be prompt in beginning and quitting. 
Also they should be encouraged to plan their answers 
before actually beginning to write. If unfamiliar types 
of exercises are used, examples should be given. In 
some cases pupils should be allowed access to text- 
books and other material, and also on some occasions 
the test questions should be given to them some time in 
advance. It is desirable that a complete copy of the 
examination be placed in the hands of each pupil. Lim-
ited, if any, choice of questions should be allowed. Cor-
rected papers should be returned promptly and dis-
cussed with the pupils. Each pupil should be told how
he stands with regard to the class as a whole. The best 
way to reduce cheating is to endeavor to build up a 
spirit of honesty, to construct examinations so that 
cheating will not be of great avail and to make it as 
difficult as possible to cheat without being detected, 
rather than to make use of an honor or pledge system. 



CHAPTER IV 

SCORING PUPILS’ RESPONSES

I. The weighting of exercises. Not only is it essential 
that examinations be properly prepared and adminis-
tered if they are to fulfill their functions adequately,
but it is also important that they be scored correctly. 
The discussion of their construction and use has per- 
haps implied certain principles and suggestions as 
to scoring, but has not treated the matter explicitly. 
It is, therefore, the purpose of this chapter to deal 
with a number of the most vital points which arise in
this connection. 

One of these is the question of weighting. There has 
been considerable argument as to whether or not ex- 
amination questions should be unequally weighted, and, 
if so, still more as to what should be the basis for de- 
termining the weights assigned them. It probably has 
been and still is the most common practice of teachers 
to make their examinations consist of a number of 
questions each of which has the same weight in de- 
termining the total mark given. Ten questions each of 
which counts 10 points are commonly used, as also are 
eight questions each counting 12½ points, twenty each
counting 5 points, five each counting 20, and so forth. 
Largely as a result of the careful work done in de-
termining the difficulty of the exercises included in
many standardized tests, attention has been directed
to the fact that examination questions or items, even
though seemingly similar, vary a great deal in their
difficulty. Therefore the practice of weighting all
equally, has often been strongly condemned. It has been
recommended that teachers weight examination ques- 
tions or test items by the same procedure as has often 
been applied in the construction of standard tests, 
though not trying them out on so many pupils nor 
carrying out all of the possible refinements of method. 
It is true that for purposes of securing exact measures 
of how difficult tasks pupils can perform, such determi- 
nations are necessary, but in most situations this is 
not the primary object of the examinations employed 
by teachers. Moreover the labor of doing so is consider- 
able and unless the results therefrom are clearly 
proven to be of considerable value, one is not justified 
in insisting that teachers perform this extra amount of 
work. 
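
The equal-weight schemes mentioned earlier in this paragraph all divide the same total evenly among the questions. The short sketch below assumes a total of 100 points, which the examples imply but the text does not state outright; the function name `equal_weight` is hypothetical.

```python
from fractions import Fraction


def equal_weight(question_count, total_points=100):
    """Return the points per question when all questions count equally."""
    return Fraction(total_points, question_count)


for count in (10, 8, 20, 5):
    print(count, "questions:", float(equal_weight(count)), "points each")
```

Exact fractions are used so that a value such as 12½ is not disturbed by rounding.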

Somewhat more recently a second basis of weight- 
ing has been recommended and employed. This plan 
advocates that the number of points of credit allowed 
for a particular item or exercise be determined by its 
social value, or, in other words, by its relative impor- 
tance, not by how difficult it is. It can readily be seen 
that this theory of weighting will frequently, or even 
usually, give results very different from those obtained 
if difficulty is used as the basis. For example, there is 
no doubt that it is of much more social importance 
for pupils to learn simple combinations in addition, 
such as 2 + 2, 3 + 4, and so forth, than it is for them
to be able to extract cube roots or to solve complex 
fractions. Yet on the basis of difficulty the ability to do 
the latter would be rated many times as high as that 
to do the former. To take another example from a 
different field, that of history, most if not all persons 
would agree that it is much more important for pupils
to know what occurred on such dates as 1492 and 1776
than for them to know the same fact for 1702 or 1882.
If, however, the relative difficulties of these dates were
determined by the proportion of people, either children 
or adults, able to name the events which occurred in 
those years, it would be found that 1492 and 1776 were 
much more generally known and therefore, in a sense, 
much easier. 

If the decision is reached to employ this second basis 
of weighting in preference to the first, the difficulty is 
by no means solved. It still remains to determine the 
relative importance of the various parts of an ex- 
amination or test. Attempts have been made to do this 
for the items composing certain bodies of subject- 
matter, usually in connection with the determination of 
minimum essentials. Sometimes the method of count- 
ing or determining occurrences or occasions for use 
outside of school has been used. In other studies, 
opinions of those supposedly competent to give them 
have been collected and from these tabulated opinions 
an index number which purports to indicate relative 
importance has been computed and assigned to each 
item. Such a method is, however, not practicable for the 
use of the ordinary class-room teacher even if perfectly 
valid, which it is not. Teachers can in some cases profit 
by the results of such studies and to some extent be 
guided by them in weighting items, but in no subject 
have complete lists of such determinations been made 
and in many subjects none at all. In practice, there- 
fore, what teachers must usually do if they wish to 
assign weights according to value or importance is 
merely to rely upon their own judgments. These judg- 
ments, of course, should be guided by some knowledge 
and study of what has been done along the lines indicated
and also by the opinions of competent persons
as to what should be the objectives of instruction. 

Anyone familiar with the standardized test move- 
ment is undoubtedly aware of the fact that within the 
last few years there has been a tendency away from 
careful and exact weighting of all items. This has 
largely been due to two causes. The first, to which 
reference has already been made, is the realization 
that the relative difficulty of the various items is not
necessarily, or even probably, proportional to the num- 
ber of points of credit which should be allowed for 
answering each correctly. The second, however, has 
probably been more potent in causing the tendency 
just mentioned. It is that statistical studies have shown 
that if the number of items in a test is fairly large the 
correlation between weighted and unweighted scores 
is so high that the supposedly slight gain in accuracy 
due to attempted exact weighting is not worth the time 
and trouble required. Several studies of standardized 
tests have yielded coefficients of correlation between
weighted and unweighted scores of .95 and above, even
.99 being reported in one case (19), and practically all
the results from such studies have been above .90. A 
number of the most widely used standardized tests, 
such as Charters’ Language and Grammar Tests and 
Monroe’s Standardized Silent Reading Tests, first ap- 
peared with weights attached to each exercise or item, 
but have in their revised forms dropped these weights 
and now consider all items as of equal value in deter- 
mining the score. Because of the evidence just referred 
to it has been suggested that the common practice of 
teachers of allowing the same number of points for 
each question on an examination is after all rather 
satisfactory. There is, however, one partial fallacy in
this application of the facts just mentioned. It is that 
many tests prepared by teachers contain only a few 
different questions or items and that for a small num- 
ber such as eight, ten, twelve or even twenty exercises 
the correlations between weighted and unweighted 
scores are rarely as high as the figures mentioned 
above, which are based on tests containing, for the 
most part, larger numbers of exercises. 
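
The relation between weighted and unweighted scores can be examined directly for any set of papers. What follows is a minimal sketch in Python, not a procedure taken from the studies cited; the function names, the simulated class, and the range of weights are all assumptions made for illustration. It computes the coefficient of correlation between the simple count of items correct and the weighted total.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def weighted_vs_unweighted(responses, weights):
    """Correlate weighted totals with simple counts of items correct.

    responses: one list per pupil, 1 for a correct item and 0 otherwise.
    weights:   points of credit allowed for each item.
    """
    unweighted = [sum(row) for row in responses]
    weighted = [sum(w * r for w, r in zip(weights, row)) for row in responses]
    return pearson(unweighted, weighted)


def simulate(n_items, n_pupils=200, seed=1928):
    """Hypothetical class: each pupil has one fixed chance of answering
    any item correctly, and item weights are whole numbers from 1 to 5.
    Both assumptions are illustrative only."""
    rng = random.Random(seed)
    weights = [rng.randint(1, 5) for _ in range(n_items)]
    responses = []
    for _ in range(n_pupils):
        p_correct = rng.uniform(0.2, 0.9)
        responses.append(
            [1 if rng.random() < p_correct else 0 for _ in range(n_items)]
        )
    return weighted_vs_unweighted(responses, weights)


if __name__ == "__main__":
    for n_items in (8, 20, 100):
        print(n_items, "items: r =", round(simulate(n_items), 3))
```

Running the sketch with 8, 20 and 100 items allows short and long tests to be compared under these assumed conditions. The figures obtained depend entirely on the simulated data and are no substitute for results computed from actual papers.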

In addition to the question of whether the measures 
or scores which result from weighted items are more 
accurate and valid than those from unweighted ones, 
there is another viewpoint from which the matter 
should be considered. It is that of the effect upon the 
pupils and their attitude toward the examination ex- 
ercises. One of the desired outcomes of instruction is 
the ability to weigh values, to judge relative importance.
If all exercises are rated as of equal value, or
apparently so because the same amount of credit is 
given for a correct answer to one as to another, pupils 
are to a certain degree impelled to consider one thing 
as important as another. On the other hand, if the more 
important questions are allowed the larger weights and 
pupils know these weights, one result is that they are 
given some idea of relative values. 

In view of the preceding discussion, it is recom- 
mended that for essay and other examinations con- 
sisting of a relatively small number of parts the teacher 
determine the number of points credit to be allowed
for each part according to her best estimate of its im- 
portance. Furthermore, as a practical device for mak- 
ing scoring easier the number of points allowed for 
any question which is divided into parts or contains 
a number of different items should be such that the 
value of each item remains an integral number. For
example, if a question contains six parts, the total num- 
ber of points credit counted should not be 5, 8, 10, or 
15, but 6, 12, 18 or some other number evenly divisible 
by six. Where there are a number of such items in a 
single exercise it is practically always satisfactory to 
count the same number of points credit for each and 
not to carry the principle of weighting according to 
relative importance to very detailed application. In 
the case of a new-type or other examination containing 
a number of rather small items, it does not seem worth 
the time and effort involved to attempt to assign dif- 
ferent weights to them. Instead, each should count the 
same number of points. If such an examination is com- 
posed of parts which differ from one another in the 
forms used, such, for example, as one containing a 
number of multiple-answer items and another a num- 
ber of completion exercises, it is frequently desirable 
to count more for each item of one sort than for each 
of another. Thus in a case such as that just mentioned 
a pupil’s score might be determined by allowing one 
point for each item correct on the first part and two 
for each blank properly filled in on the second. In a 
later chapter where the different types of objective 
tests are discussed more will be suggested as to their 
relative difficulty and weight.
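
The two practical devices just recommended amount to simple arithmetic, which may be sketched as follows. The function names and the default weights of one and two points are assumptions chosen to match the illustrations in the text, not fixed rules.

```python
def admissible_totals(n_parts, max_points):
    """Totals up to max_points that give every part of a question
    a whole number of points when all parts count equally."""
    return list(range(n_parts, max_points + 1, n_parts))


def two_part_score(n_correct_first, n_correct_second,
                   first_weight=1, second_weight=2):
    """Score on a test whose two parts use different forms, with each
    item of the second part counting more than each item of the first."""
    return first_weight * n_correct_first + second_weight * n_correct_second


if __name__ == "__main__":
    # A question with six parts, allowing at most twenty points.
    print(admissible_totals(6, 20))
    # Thirty multiple-answer items right and twelve blanks properly filled.
    print(two_part_score(30, 12))
```

For a question of six parts the first function returns 6, 12 and 18, the totals named above, and excludes 5, 8, 10 and 15.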

2. Scoring examination and test papers. The first
and most important rule to be laid down in connection 
with the general scoring of pupils’ papers is that if 
the examination or test is worth giving at all the an- 
swers are worth scoring carefully. A teacher should 
devote as much care and attention to this as to any 
other phase of her duties as a teacher. In some cases 
much or even all of the scoring can be done by the 
pupils under the teacher’s direction, but in many
others, especially in connection with final examina- 
tions, the teacher should and must do it herself. As has 
been suggested elsewhere it is desirable to formulate 
examination exercises so that the labor of scoring an- 
swers is reduced to a minimum. There is no virtue in 
merely putting in long hours reading and rating pupils’ 
responses. Whether the time required is one minute 
or ten minutes per pupil, however, the work should be 
done as carefully as possible. 

Except in the case of test questions to which the an- 
swers are very simple and definite, it is generally de- 
sirable and advantageous for a teacher to write out 
the answers which she considers correct. In some cases 
she may merely formulate them definitely in her own 
mind. She should not wait until the time of scoring the 
papers arrives to prepare her answers, but should do 
so when making up the examination questions. So doing 
may lead her to see imperfections in her original ques- 
tions and indicate desirable modifications or substi- 
tutions. With these standards in mind, and preferably 
actually written out and before her eyes, she should 
proceed to score the papers. As variations in pupils’ 
answers are encountered and considered right or 
wrong or given a certain number of points out of the 
total number allowed for the question, she should make 
brief notes of what she has done so that if similar cases 
are encountered later she will be sure to be consistent 
in her scoring. For example, if, on a geography test, 
pupils have been asked to name all the continents and 
oceans she should note, if she has not previously de- 
termined it, the amount of credit she allows the first 
pupil who names all but one, the first who names all but 
two, and so forth, so that when the papers of other 
pupils who did the same thing are encountered she
will be certain to give them the same number of points. 
It is preferable to prepare in advance a scale of points 
of credit to be given. Thus a teacher might in the instance
just mentioned decide to deduct from the maximum
score on the question one point for each ocean
or continent omitted. 
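
A scale of deductions and a record of rulings already made can both be kept very simply. The sketch below is only one possible arrangement; the class name, the function name, and the twelve-point maximum used in the example are hypothetical.

```python
def credit_after_omissions(max_points, n_omitted, deduction_per_omission=1):
    """Credit left after a fixed deduction for each item omitted,
    never falling below zero."""
    return max(0, max_points - deduction_per_omission * n_omitted)


class ScoringNotes:
    """Brief notes of credit already allowed, so that similar answers
    met later receive the same number of points."""

    def __init__(self):
        self._rulings = {}

    def credit(self, question, case, proposed_credit):
        """Return the credit first allowed for this case on this question.
        The proposed credit is recorded only if the case is new."""
        return self._rulings.setdefault((question, case), proposed_credit)


if __name__ == "__main__":
    notes = ScoringNotes()
    first = notes.credit("continents and oceans", "one omitted",
                         credit_after_omissions(12, 1))
    later = notes.credit("continents and oceans", "one omitted", 9)
    print(first, later)  # the later paper receives the earlier ruling
```

The second call proposes a different credit, but the note made for the first pupil governs, which is the consistency the procedure is meant to secure.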

In scoring discussion examinations it is usually ad- 
vantageous to handle one question at a time, that is, 
to go through all the pupils’ papers, scoring their 
answers to the first question, then again for the second 
question, and so on. The chief immediate advantage 
of such a procedure is that the teacher is not obliged 
to hold as many different matters in her mind at one 
time as when she scores all of the answers on one 
pupil’s paper, then all on that of another pupil, and so 
on. By concentrating on one question at a time she 
can much more easily keep the same or approximately 
the same standards in mind while scoring all answers. 
It is true that the method just recommended requires 
somewhat more handling of papers, but so doing con- 
sumes very little time, since papers can be passed from 
one pile to another as the first question is being scored, 
back again for the second one, and so on alternately. 
The slight additional amount of time required to do 
so is fully compensated for by the gain mentioned 
above. On examinations composed of items to which 
the answers are short and in general definitely right
or wrong, it is probably best not to use this method. 
In such cases, there is very little difficulty in holding 
in mind the correct answers if the number of items is 
small, whereas if it is large a prepared list of answers 
is usually employed. Moreover, if there are many items 
the time required to shift the papers from one pile to 
the other becomes so considerable, since many shifts
must be made, that this method is made too laborious. 

A method of scoring which has sometimes been
recommended is the sorting method. According to it a 
teacher not only reads the answers to one question at 
a time, but also for each question sorts the papers into 
a small number of piles, usually not less than five nor 
more than ten, before determining the amount of credit 
to be allowed for each pupil’s answer. The answers 
placed in each pile are supposed to be of approximately 
equal merit. In connection with this plan it is usually 
stated that after the papers have been sorted into piles, 
those in each pile should be compared with those 
in the ones next to it and any possible mistakes in the 
first sorting corrected. When the sorting has been com- 
pleted, the same amount of credit is given all the an- 
swers in each pile. Although the sorting method pos- 
sesses certain advantages, the chief being that it tends 
to increase the reliability of marking, yet the time 
consumed is so much greater than by the ordinary 
method that it can hardly be recommended for general 
use. In some cases, especially those in which there are 
not a number of questions each of which must be scored 
separately but only one, this may be used without such 
an excessive time cost. For example, if pupils have 
been asked to write a 200-word biography of some 
historical character their efforts may well be sorted 
into a few piles as the basis of the marks assigned. 

A teacher should adopt certain general rules to be 
followed in the scoring of all examinations and also 
certain special ones for use with each particular test. 
The general rules should concern such matters as how 
mistakes in English, lack of neatness and so on are to 
be treated, whether in mathematics absolute accuracy 
will be required for any credit or partial credit given
for correct principle or partially covered computa- 
tions, and so forth. Indeed, such rules may well be 
adopted by a department in a school or, even better, by 
a whole school. In at least some of these general mat- 
ters, uniformity is desirable. The rules adopted should 
be clearly understood by pupils as well as teachers. In 
addition to following these general rules each teacher 
should decide exactly how she is going to score each 
test or examination which she gives. At the time of con- 
structing it she should determine the number of points 
to be allotted to each question or item, how to score 
partially correct answers, how much to count on speed 
and how much on accuracy, and so on. The prepara- 
tion of such rules, both the more general ones 
and those for particular tests, will usually have 
considerable effect in making examination marks more 
reliable, that is, in reducing the variability between 
those which would be given by different teachers 
or by the same teachers at different times. 
Kelly (41, p. 83) describes an experiment which in- 
dicates this. A number of fifth-grade teachers gave 
their pupils the same arithmetic examination and 
each scored the papers for her own pupils. A 
set of rules to be followed in scoring these papers 
was then prepared and all the teachers rescored them 
following these rules. The sets of marks in each case 
were then compared with marks given by the teacher 
who prepared the rules, a teacher who was known to be 
unusually systematic in her scoring of papers. It was 
found that the decrease in variations was very great. 
In the marking without rules, 3 per cent of the 
teachers varied from the marks given by the one taken 
as standard by more than twenty points, about 16 per 
cent by more than ten points and less than 6 per cent
agreed exactly. The marks given according to the set 
of uniform rules varied by more than ten points in 
only 1 per cent of the cases and were in exact agreement
with the standard in about 63 per cent of the
cases. 

The statement made above that general rules should 
be adopted concerning what should be done about mis- 
takes in English, such as capitalization, punctuation, 
grammar and spelling, poor handwriting and general 
lack of neatness, and so forth, raises the question as 
to just what these rules should be. It is not an uncom- 
mon practice for teachers to make deductions for such 
faults as those mentioned. Sometimes a definite amount 
is taken off for each error, sometimes a less definite 
plan is followed. The writer does not believe that any 
deduction should be made, but rather that a pupil’s 
mark in a given subject should be based upon his 
ability and achievement in that subject and should in 
no way depend upon how well he can write, spell,
punctuate, capitalize, and so forth. Only in language 
would he permit these items to affect a pupil’s mark 
directly. 

It is not meant to imply that emphasis should not 
be placed upon neatness, good penmanship, and correct 
English in all subjects. Teachers should call attention 
to such errors upon pupils’ papers, but not make definite
deductions from the scores for them. A standard
should be set and no paper, whether written on an ex- 
amination or under any other circumstances, accepted 
which falls below this standard. If practicable, and it 
frequently is, pupils may be given the opportunity to 
recopy such papers, putting them in correct or at least
better form, and then have them scored. If this is impracticable
or if they do not recopy them satisfactorily
teachers should refuse to score them. Experience in- 
dicates that this procedure will have at least as good 
an effect towards securing good English and so forth 
as the other plan and in addition will make the marks 
given in subjects other than English more definitely 
indicative of achievement therein. The standards which 
must be met before papers are accepted should of 
course vary with the maturity of pupils, increasing 
from grade to grade. In some cases they may be more 
or less objectively defined, though in others this is 
difficult. Thus in the case of handwriting, a quality 
equal to a given specimen or value on one of the stand- 
ardized scales in that subject may be required. The 
spelling standard may be in terms of errors per 100 
words, as also may be those in punctuation and other 
particulars. In other respects the standards will be 
more or less subjective and have to be determined by 
the teacher’s judgment. 

It has been suggested that on finals and other rela- 
tively important examinations what is sometimes called 
the committee system of marking be employed. This 
system consists of having a number of teachers, usually 
as many as there are questions contained in the exami- 
nation, work together. Each teacher marks all of the an- 
swers to one question written by the pupils of all the 
teachers. The chief advantage of this plan is that more 
reliable marks are secured. This results because each 
question is marked more nearly upon the same basis 
than if scored by a number of teachers and also because 
the tendencies of some teachers to mark high and of 
others to mark low more or less balance and thus give 
a more reliable total score for each paper. 

Although most persons who have mentioned the matter
have favored this plan, there have been some objections
raised to it. Perhaps the best statement of
these is by Thoma (86) who gives six arguments 
against the procedure described above. In the first 
place, she does not believe that it gives fair marks. 
Since each teacher has her special points of emphasis, 
it is fairer for her to mark the performance of her 
own pupils in the light of these points than for others 
not familiar with her teaching to do so. In the second 
place, she states that excessive time and strain are re- 
quired by this method with no compensating results. 
Third, she maintains that ill-feeling is engendered be- 
tween teachers working together when they learn that 
they differ markedly from one another in their stand- 
ards of marking. Further, she believes that pupils 
prefer to be marked by their own teachers, even though 
they consider them hard markers, rather than to have 
other teachers participate therein. In the fifth place, 
teachers should read the entire examination papers of 
their pupils in order to discover their errors and weak- 
nesses, as well as their strong points, and thus learn 
the general efficiency of instruction and what is needed 
in the future. Finally, Thoma objects to the monotony 
of long continued rating of the same question. 

It cannot be denied that at least some of these ob- 
jections have a considerable degree of validity, though 
others are of little merit. It appears that the most valid 
of them is the next to the last, that teachers do not 
gain such an intimate and helpful acquaintance with 
their pupils’ work as they do by rating all of their 
papers themselves. There is no doubt too that more 
time is required by this method, partly because of the 
waste in preliminaries and getting organized and 
partly because it rarely occurs that each question re- 
quires about the same amount of time to score. There-
fore, if teachers are working together some must re- 
main idle a portion of the time waiting for the teachers 
who are scoring the questions which require long 
periods of time to pass the papers on. It seems well to 
summarize the matter thus: Occasionally, when a 
fairly important examination is given with the chief 
purpose of securing highly accurate ratings of the 
pupils’ abilities or achievements this system may be 
used. In such instances all the teachers participating 
should be actually teaching the subject or subjects
covered by the examination. In the case of most ex- 
aminations and tests prepared by teachers, however, 
it is better for an individual teacher to score the papers 
of her own pupils. As has been suggested elsewhere she 
may with profit occasionally exchange papers with 
some other teacher, preferably not always the same 
one, and each score the other’s papers and then com- 
pare results. This will serve to call her attention to 
points in which her scoring may well be modified. 

Another procedure which has been suggested ac- 
complishes in part the same result as committee mark- 
ing of papers. It is that teachers should not know at 
any particular time whose paper they are marking. To 
accomplish this end it has been proposed that pupils 
place their names where they will not be seen until 
after the papers have been marked or perhaps do not 
place their names on their papers at all, but employ 
some private identifying mark, and after they have 
been rated make the authorship of each paper known 
to the teacher. A somewhat similar practice sometimes 
urged is that a teacher regularly mark the papers of 
some other teacher’s pupils and the other teacher those 
of the pupils of the first teacher. It cannot be doubted 
that anonymous marking according to either of the
two plans just mentioned is likely to increase the ac- 
curacy and objectivity of the marks given. If the marks 
are to be used entirely or primarily as exact measures 
this is perhaps desirable, though it should not be over- 
looked that there may be very practical difficulties in 
the way of carrying it out. However, marks are often, 
probably most often, used for diagnostic or remedial 
purposes, for determining promotion, and so forth. For 
such purposes pupils’ achievements should be judged 
in the light of other known facts concerning them and 
the teacher’s decision should not be an absolutely im- 
personal action but rather should take into considera- 
tion the individuals concerned, though of course not to 
the extent of being dictated by mere sympathy for 
them. 

Whenever papers are to be returned to pupils (and
this should include practically all cases except final examinations,
and those when possible) teachers should
be careful to indicate by marks which the pupils under- 
stand readily all instances in which full credit is not 
allowed. Ordinarily these marks should be such as to 
serve merely to point out the places in which the errors 
occurred, and perhaps the types of errors, but not to 
give the correct answers. After the papers have been 
returned, which should be as soon after the test as 
possible, pupils should be required to examine their 
papers, discover the errors they have made and cor- 
rect them. In some cases it is desirable to have this 
done in writing and the corrected papers handed in. 
In others the matter can be handled satisfactorily in 
general class discussions which deal with all errors 
commonly made, and give the correct answers. In the 
cases in which the pupils score their own papers, the 
same procedure should follow the initial scoring as
when the teacher has done so. If properly handled, this 
is one of the most helpful features of examinations. 

One minor but important point in marking papers 
is that the teacher should employ pencil or ink of a 
distinctly different color than that used by the pupils 
so that her marks and corrections will not be over- 
looked easily. A red pencil is nearly always satisfactory 
from this standpoint, likewise a blue one when pencil 
or black ink has been used. If some of the pupils have 
employed blue ink, however, the teacher should be 
careful not to use a blue pencil in correcting. 

In connection with new-type tests, it is almost al- 
ways decidedly economical of time to prepare a written 
list of answers in such a form that the list can be 
placed just beside the answers on each pupil’s paper. 
This involves the setting up of the test so that it is 
easy to prepare a list of answers which can be so 
placed. On a true-false test, for example, it is better 
to have the truth or falsity of each statement indicated 
by a response at the left-hand side of the paper just 
before the statement so that the answers will form a 
straight column, rather than at the ends of the state- 
ments, in which case the responses would appear in 
irregular position on the paper. 

When pupils score test papers it is sometimes desir- 
able to have them score their own papers, and some- 
times to have them score one another’s. In almost all 
cases, the teacher should in some way check up on the
accuracy of scoring done by pupils. When the scoring 
is done by others than the writers of the papers it can 
usually be rather satisfactorily done by having the 
papers passed back to the pupils who wrote them, 
and repeating the correct answers again. In some cases, 
the teacher should collect the papers and then score
them herself, partially to determine the accuracy of 
scoring by pupils and partially to make the pupils feel 
that their scoring is likely to be examined by the 
teacher and therefore to stimulate them to be careful 
in this regard. 

3. Changing scores into marks. One very important,
perhaps the most important, principle in connection 
with rating pupils’ answers is that the teacher make 
a distinction between scores and marks. A score should 
be thought of as the number of points credit given a 
pupil on a particular examination, test or other piece 
of work. A mark is the transmutation of a score into 
terms of the marking system used in the school in ques- 
tion. For example, if pupils are given a list of twenty 
words to spell their scores are ordinarily the numbers 
of words spelled correctly. In other words, pupils who 
spell all the words correctly receive scores of twenty, 
those who spell all except one receive nineteen, and 
so on. Their marks, however, are determined by chang- 
ing these scores into terms of the marking system em- 
ployed. If it is a percentile one it is very likely that 
each score will be multiplied by five or, in other words, 
5 per cent counted on each word. If it is a letter system 
certain of the best scores, perhaps only twenty or per- 
haps twenty and nineteen, will be considered A or 
whatever the highest letter is; others, possibly seven- 
teen and eighteen, will be considered B, or the second 
highest letter; and so on. 
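
The distinction between a score and a mark in the spelling illustration may be put in the form of a short sketch. The function names are hypothetical. The bands for A and B follow the possibilities mentioned above, while those for C and D are assumed merely to complete the example.

```python
def percent_mark(score, n_words=20):
    """Mark on a per cent system when every word counts equally,
    that is, 5 per cent for each of twenty words."""
    return score * 100 / n_words


# Lowest score admitted to each letter. The A and B bands follow the
# illustration in the text; the C and D bands are assumed.
LETTER_BANDS = [("A", 19), ("B", 17), ("C", 15), ("D", 13)]


def letter_mark(score, bands=LETTER_BANDS, lowest_letter="E"):
    """Mark on a letter system: the first band whose lowest admitted
    score is reached, or the lowest letter if none is reached."""
    for letter, lowest in bands:
        if score >= lowest:
            return letter
    return lowest_letter


if __name__ == "__main__":
    for score in (20, 19, 18, 17, 12):
        print(score, percent_mark(score), letter_mark(score))
```

The same score of nineteen thus becomes 95 on the one system and A on the other. The score itself is unchanged, and only its expression in the marking system differs.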

It is especially important to keep the distinction between
scores and marks in mind in cases in which the
percentile marking system is used and in which the 
scores also are computed on the basis of one hundred 
points. A clear illustration of this may be given by 
using a standardized scale as an example. The specimens
on the Ayres Handwriting Scale range in value
from 20 to 90 and it is apparent that they are essen- 
tially percentile values, that is, that a value of zero 
would represent a performance which had no value as 
handwriting and one of 100 perfect handwriting. If 
samples of writing from the lower grades are rated by 
the scale, it will be found that most of those from the 
third grade receive scores of 20, 30, 40 and per- 
haps 50. It would be absurd, however, to give third- 
grade pupils percentile marks that low, since writing 
of quality 70 is generally accepted as satisfactory for 
pupils completing the eighth grade. The proper marks 
to be given the third-grade pupils should be determined 
by comparing their handwriting with standards which 
have been set as appropriate for pupils of that grade. 
The same principle should be applied in all cases ; that 
is, marks should be determined by comparing scores 
with standard scores or norms, to some extent with
whole distributions of scores, and with the objectives 
of instruction. Needless to say, such other factors as 
the intelligence of the pupils and points of special 
emphasis by the teacher will enter into the determina- 
tion of marks. 

Before scores can be changed into marks, it is neces- 
sary not only that a marking system which is definite, 
in so far as the symbols employed are concerned, be 
in mind, but also that there be a definite plan of assign- 
ing marks so that it is known just what each symbol 
employed means. Elsewhere in this volume will be
found a discussion of whether or not the normal or 
some other more or less fixed distribution of marks 
should be adopted and followed. If any distribution 
is adopted the transmutation of scores into marks must 
necessarily be such as to yield results in harmony with
that distribution. A number of methods have been pro- 
posed and used by which scores when tabulated in a 
distribution may be changed into marks according to 
a given system and distribution. In some cases these 
methods provide for so doing with a high degree of 
accuracy, in others they are only approximate. Such 
methods as the ratio method, the percentile method, 
the equivalent score equation method, and the T- or 
M-score method are all intended to accomplish this 
purpose. In general these methods are statistically 
valid and give satisfactory results, but are too elabo- 
rate for ordinary use and it cannot be expected that 
a teacher will take the trouble to employ them in con- 
nection with her regular tests and examinations. 

Because of the difficulty of using the methods just
mentioned, it is recommended that a much simpler one 
be employed. According to this the scores should first 
be tabulated. For a single class of ordinary size, this 
merely means that they should be arranged in order 
from smallest to largest. For a larger number of 
pupils, let us say 50 or more, they should be tabulated 
in a frequency distribution, that is, grouped into from 
10 to 20 classes, according to their magnitude. After
they have been so arranged, whether in a simple series 
or in a grouped distribution, the teacher should make 
a rough inspection of the scores to determine if they 
show any tendency to fall into groups more or less 
similar to the per cents or proportions of different 
symbols called for by the marking system used. She 
should then, starting in at one end or the other of the 
distribution, determine the marks to be given for the 
various scores. This determination ought not to be
based upon any absolutely rigid or hard and fast application
of an arbitrary distribution, but should interpret
such a general or ideal distribution in terms of
any conditions, such as relative ease or difficulty of the 
examination, intelligence of the pupils, and so forth, 
which, in her judgment, should affect the marks given. 
In deciding upon the dividing line between different
marks she should attempt to have it fall at gaps or
breaks in the distribution of scores. For example,
if an A, B, C, D, and E system of marks is
used she should attempt to place the dividing
lines between A and B, B and C, and so forth, at
points where there are gaps in the array of
scores. Although it will frequently be found impossible
to do this, it can be done to at least a
limited extent in many cases.

The process just described can probably be
made much clearer by an actual example. The
accompanying are the actual total scores given
the papers of a class of 31 individuals on a particular
examination. They have been arranged
in order from largest to smallest. The total number
of points allowed on the examination was 150.
The marks used were A, B, C, D, and E,
of which the first four were passing marks and
E a failing mark. The distribution of marks¹
to which the teacher intended to adhere approximately
was as follows: A’s, 10 per cent;
B’s, 30 per cent; C’s, 40 per cent; D’s, 15 per
cent; and E’s, 5 per cent.

135  133  130  126  121  120  120  120
120  119  119  116  116  114  112  109
108  107  106  104  104  104  103  102
 97   96   96   93   88   79   76

Beginning at the top of the list, that is, with 
the high scores, it is seen that there are three of 
130 or above, next a rather isolated one at 126 
and then a number fairly close together at 121 
and below. It would appear, therefore, that the 


iTb.e reader should understand clearly that the writer does not in any sense 



SCORING PUPILS’ RESPONSES 101 

dividing line between A ’s and B ’s might well be drawn 
either between 130 and 126 or between 126 and 121. 
In other words, it seems rather clear that the three 
scores at 130 or above should be considered as A’s, 
but doubtful what mark should be given as equivalent 
to 126. In this particular case, the teacher concerned 
gave only three A’s,. and called 126 a B, thus giving 
almost exactly 10 per cent of the class A’s. Consid- 
ering next the scores at 121 and below, we find seven 
closely grouped at that point, 120 and 119. It seems 
evident that these seven in addition to the one at 126 
should be considered as B’s. Below 119 there is a gap 
of three points to two scores at 116 and then below 
these a gap of two points to one at 114. The chief doubt 
here appears to be concerning the two scores at 116, 
and perhaps also the one at 114, whether to call them 
B’s or C’s. As was stated above, the teacher in ques- 
tion intended to give somewhere near 30 per cent of 
the class B’s. In this case, 30 per cent of the class 
equals 9.3. If the two scores at 116 are not included 
there will be eight B’s; if they are included there will 
be ten, and if the one at 114 is added, eleven. Because 
ten is nearer 9.3 than either eight or eleven, the teacher 
in this ease considered 116, but not 114, as being a B. 
Going on down there are eleven more scores above 100, 
and also a break of five points from the lowest of these 
to the next highest. The teacher therefore called these 
eleven C’s, although this did not give quite the 40 per 
cent which she had in mind. The five scores from 97 
down to 88, below which there is a distinct break of 
nine points, were called D’s and the two lowest ones, 
79 and 76, E’s. 

wish to set up the distribution of marlos given above as ideal, but is merely 
using the actual distribution which a certain teacher had in mind as an eammple. 
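
The reasoning of the example just given may perhaps be made easier to follow if it is restated as a short program. The following is only a sketch in Python and forms no part of the method as described: the names `target_counts`, `gaps`, `assign`, and `teacher_cutoffs` are hypothetical, and the dividing lines are simply those which the teacher in the example reached by judgment. The program merely lays out the two things she consulted, namely the number of papers each mark would receive under the intended distribution and the places where breaks occur in the array of scores.

```python
from collections import Counter

# Scores from the example above: 31 papers, 150 points possible, largest first.
scores = [135, 133, 130, 126, 121, 120, 120, 120, 120, 119, 119,
          116, 116, 114, 112, 109, 108, 107, 106, 104, 104, 104,
          103, 102, 97, 96, 96, 93, 88, 79, 76]

# Approximate distribution the teacher had in mind, in per cent of the class.
intended = {"A": 10, "B": 30, "C": 40, "D": 15, "E": 5}


def target_counts(n_pupils, percents):
    """Number of papers each mark would receive if the distribution were followed exactly."""
    return {mark: n_pupils * pct / 100 for mark, pct in percents.items()}


def gaps(sorted_scores, min_gap=2):
    """Return (upper, lower, size) for adjacent scores at least min_gap points apart."""
    return [(hi, lo, hi - lo)
            for hi, lo in zip(sorted_scores, sorted_scores[1:])
            if hi - lo >= min_gap]


def assign(score, cutoffs):
    """Return the first mark whose lowest admitted score does not exceed the given score."""
    for mark, lowest in cutoffs:
        if score >= lowest:
            return mark
    raise ValueError("no cutoff admits this score")


# Dividing lines as the teacher in the example finally drew them (a matter of judgment,
# not something the program decides).
teacher_cutoffs = [("A", 130), ("B", 116), ("C", 102), ("D", 88), ("E", 0)]

tally = Counter(assign(s, teacher_cutoffs) for s in scores)

print(target_counts(len(scores), intended))  # intended number of papers per mark
print(gaps(scores, min_gap=3))               # the wider breaks in the array
print(dict(tally))                           # papers actually given each mark
```

It should be observed that the program does not choose the dividing lines. It only places the intended numbers and the breaks side by side, leaving the decision to the teacher, which is in keeping with the informal character of the procedure.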




The question has probably already arisen in the 
mind of the reader as to whether it is not much easier 
to make such a transmutation of scores to marks by 
the more or less informal method just described when 
a letter system embracing only a few marks is used 
rather than a percentile marking system. There is 
no doubt that this is true. However, it is not unduly 
difficult to change point scores into percentile marks. 
Although, as is stated more fully elsewhere, the writer 
is opposed to the ordinary percentile marking system 
and favors the use of a small number of letters or other 
symbols, he will proceed to describe how this same 
series of scores may be changed into percentile marks 
by the same general procedure as was used above. 

The steps in making such a transmutation are much 
the same as those illustrated above except that some 
additional labor is necessary. First, the teacher should, 
as in the previous case, have in mind an approximate 
ideal distribution of marks which will indicate how 
many should fall in each of a few large groups. For 
example, assuming that 70 per cent is the passing 
mark, she may have in mind the following distribu-
tion: 90 to 100, 15 per cent; 80 to 89, 35 per cent;
70 to 79, 40 per cent; 0 to 69, 10 per cent. She would
then proceed as in the former example to fix the divi-
sion points between these four classes or groups of
marks. Fifteen per cent of the class is 4.65, so she 
would probably give four or five marks of 90 or above. 
Since there is a fairly wide gap between 126 and 121, 
but none immediately below the latter, she might de- 
cide that the four highest scores should be considered 
equivalent to marks somewhere in the 90’s. Thirty-
five per cent of 31 is 10.85, so she might well take the
next eleven scores, the lowest of which is 112, as being
equivalent to marks from 80 to 89. Forty per cent of
31 is 12.4 and since there are thirteen scores on down 
to and including 93 these thirteen may well be con- 
sidered as equal to marks from 70 to 79. The three 
lowest scores are thus left to be considered as failing 
marks below 70. 

The next step is to go through the scores again 
changing each into a definite percentile mark. Since a 
perfect score on the test was 150 and the highest point 
score made only 135, the teacher would probably not 
want to give the pupil who made it much if any higher 
than 95 per cent as a mark. Assuming that this was 
done the next three scores might each, in order, be 
called 1 per cent lower. By so doing, 133 becomes 94 
per cent, 130, 93 per cent, and 126, 92 per cent. Per- 
haps it would be better to call the latter 91 per cent 
since the difference in score between it and 130 is four, 
whereas the other two differences were only two and 
three. It has already been decided that the scores 
from 112 to 121 inclusive will equal marks from 80 
to 89 per cent. Since the difference between the
smallest and largest score in this group is nine, just 
the same as between the lowest and highest mark, the 
score of 121 can be equated to 89 per cent, which is the 
highest mark in this interval, and the other marks 
found by subtracting from 89 per cent as many points 
as the corresponding scores are less than 121. Thus, 
a score of 120 becomes equal to a mark of 88 per cent, a 
score of 119 to 87, of 116 to 84, of 114 to 82, and of 112 
to 80. In the next group, which contains scores from 
109 down to 93, the range in scores is almost twice as
great as that in per cents and one may transmute them
by assuming that two points of score are roughly equal
to 1 per cent. Thus, 109 and also 108 may be considered
as equal to 79 per cent, 107 and 106 as 78 per cent, 104
as 77, 103 and 102 as 76, 97 as 73, 96 as 72, and 93 as 70. 
It will be seen that in this case perhaps more than in 
the case of the higher scores various teachers would 
differ in the marks assigned. Instead of doing just as 
has been done one may call 109 equal to 79 per cent, 
108 and 107 to 78 per cent, 106 to 77 per cent, 104 to 
76, 103 and 102 to 75, 97 to 72, 96 to 71, and 93 to 70 
per cent. It would be difficult to say which one of the
two is the best transmutation although the writer be- 
lieves that the former is slightly superior. Finally, the 
three lower scores may be turned into marks by call-
ing 88 equal to 66 or 67 per cent, 79 perhaps 60 per cent, and
76 perhaps 58 per cent. Another might call them re-
spectively 68, 62 and 60 or some other per cents.
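
The conversion within a single group of scores, as just described, amounts to counting down from the highest score of the group and taking off 1 per cent for each point, or for each two points, by which a score falls below it. A minimal sketch in Python is given here under that assumption; the function name `transmute_band` and its parameters are hypothetical and are introduced only for illustration.

```python
def transmute_band(scores, top_score, top_mark, points_per_percent=1.0):
    """Count down from the band's top score, taking off 1 per cent
    for every points_per_percent points by which a score falls below it."""
    return {s: top_mark - (top_score - s) / points_per_percent for s in scores}


# Scores judged equivalent to marks of 80 to 89; one point of score to 1 per cent.
band_80s = [121, 120, 119, 116, 114, 112]
marks_80s = transmute_band(band_80s, top_score=121, top_mark=89)

# Scores judged equivalent to marks of 70 to 79; two points of score to 1 per cent.
band_70s = [109, 108, 107, 106, 104, 103, 102, 97, 96, 93]
marks_70s = transmute_band(band_70s, top_score=109, top_mark=79,
                           points_per_percent=2)

for score, mark in {**marks_80s, **marks_70s}.items():
    print(score, mark)
```

For the group from 121 down to 112 the sketch yields the same marks as those stated in the text. For the group from 109 down to 93 it yields half per cents and leaves the lowest score somewhat above 70, whereas the marks given in the text were rounded and adjusted by judgment. The difference serves to illustrate the point that various teachers would differ in the marks assigned.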

Perhaps the one fact to be most strongly empha- 
sized in connection with the transmutation of scores, 
whether into a few symbols or into percentile marks, 
is that it cannot be done by an absolutely or even ap- 
proximately automatic or mechanical process without 
considerable computation and labor. A considerable 
amount of good judgment and well-formed opinion 
must enter into and direct the transmutation. For ex- 
ample, the changes into marks made above have been 
based on the general assumption that the class and its 
achievement might be considered as typical or average. 
If, however, the teacher had known from past expe- 
riences or from the scores of other sections of pupils 
carrying the same work that the scores made on the 
examination in question were unusually low she would 
probably have transmuted them into marks in such a 
way that few if any A’s or marks of 90 or higher would 
have been given. On the other hand, if she had known 
that the examination was unusually difficult and that
the class did better than the average she might
have given more higher marks and fewer lower ones.
Whether or not she would have been justified in either
case is discussed elsewhere in this book.

A question which frequently arises and has already 
been briefly referred to is that of determining what 
mark to give the one or few pupils who score highest 
even though this score is not perfect or even very near 
perfect. Without entering into any detailed discus- 
sion of the matter it seems best to recommend that if 
a class is approximately an average group or better 
and if it contains at least as many as a dozen or fifteen 
pupils and if only a few symbols are employed, the 
highest score or scores should usually be changed into 
A or whatever the highest letter or symbol in the mark- 
ing system is. If the percentile plan of marking is used 
it should be converted into a mark of 95 or above. Oc- 
casionally it may be evident that the class as a whole 
has done so poorly, possibly because of lack of prep- 
aration, that no one should receive a mark this high. 
In case plus and minus signs are used in connection 
with a letter or symbol system, A+ or whatever is the
very highest mark employed should be saved for in-
stances of practically perfect work or when one or per-
haps a very few individuals do decidedly better than
all other members of the class. That is to say that A
rather than A+ should usually be the highest mark
given in a class of ordinary size. 

It has been the general implication of the preceding 
discussion that a set of scores should be turned into 
marks which when tabulated conform more or less 
closely to an ideal or standard distribution. As has 
been stated elsewhere in this book, a teacher should 
have such an ideal distribution in mind, but she should
not follow it slavishly nor attempt to make her marks
conform exactly to it in all cases. In this connection 
the point should be made that she may well use more 
freedom and depart further from such a standard dis- 
tribution in the marks given upon any single test or 
examination than in those at the end of the course. 
Indeed, the marks assigned in the example above fol- 
low the general or ideal distribution more exactly than 
would probably occur in most instances. If such de- 
partures from the ideal distribution are not consist- 
ently in the same direction, but rather variable, it will 
be found that the distribution of average marks for 
the semester or term will probably approximate the 
standard distribution much more closely than does 
that from a single test. 

It was stated above that a number of writers have 
suggested and elaborated rather complicated and ex- 
act methods of changing scores into marks, but that 
most, if not all, of these require too much labor to be 
satisfactory for ordinary use. Some methods have 
been suggested, however, which can be used without 
so much labor and which their authors at least believe 
are superior to the one described in the text. One of 
these which French (26) and others have suggested
applies only in case percentile marks are used. It pro- 
vides that after the scores have been arranged in order 
a pupil be chosen whose score is perhaps from third 
to fifth below the highest one made, and this pupil’s 
paper considered as deserving the mark reported for 
his usual or average school work. Similarly, a pupil, 
preferably one whose work is barely passing, is chosen 
near the lower end, and his score considered as equiva-
lent to the passing mark. If we suppose that the first
pupil is rated at 95 per cent and the second one at 70
per cent or passing, we have an interval of 25 points
on the percentile scale. This interval, whether it be 25
or some other number, is considered as equal to the
interval in point scores between the same two pupils’
papers. If, for example, the first pupil scored 75 points
and the second 35, the point score interval is 40. This is
divided by 25 and the quotient, 1.6 score points, con- 
sidered to be equivalent to 1 per cent. On this basis all 
the scores are turned into per cents, working down 
from the higher point or up from the lower one. Thus a 
pupil who scores 63 points makes 12 less than 75. Di- 
viding 12 by 1.6 gives 7.5, which is the number of per 
cent his mark is less than 95. Therefore he receives 
87.5 per cent. The same mark may also be obtained by 
subtracting 35 from 63, dividing the result by 1.6 and 
adding the quotient to 70 per cent. 

This method is sometimes, perhaps more often than 
not, modified by starting with the paper receiving the 
highest score and determining the corresponding per- 
centile mark for that, and also using the one with the 
lowest score instead of the one just barely passing. 
The rest of the procedure is the same. It is not ap- 
parent that this method is preferable to the one which 
has been described at more length, yet on the whole it 
should be rated as a fairly good one. 
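
The computation involved in this method is a simple proportion between two fixed points, and may be sketched as follows in Python. The function names `points_per_percent` and `transmute` are hypothetical and are introduced only for illustration; the figures are those of the example in the text, in which a score of 75 is taken as 95 per cent and a score of 35 as the passing mark of 70 per cent.

```python
def points_per_percent(high_score, high_mark, low_score, low_mark):
    """Number of score points considered equivalent to 1 per cent."""
    return (high_score - low_score) / (high_mark - low_mark)


def transmute(score, high_score, high_mark, low_score, low_mark):
    """Turn a point score into a per cent mark by working up from the lower pupil."""
    span_scores = high_score - low_score
    span_marks = high_mark - low_mark
    return low_mark + (score - low_score) * span_marks / span_scores


# The two pupils chosen in the example: 75 points is rated 95, 35 points is rated 70.
chosen = dict(high_score=75, high_mark=95, low_score=35, low_mark=70)

print(points_per_percent(**chosen))
for s in (75, 63, 35):
    print(s, transmute(s, **chosen))
```

The same function serves for the modified plan, in which the papers receiving the highest and the lowest scores are taken as the two fixed points; only the four values supplied need be changed.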

4. Summary. Several different plans of weighting
examination questions have been practised and urged, 
no one of which can be said to be superior to the others 
for all purposes. For ordinary class-room tests of the 
traditional type it is recommended that the questions 
be weighted as nearly as possible in proportion to 
their importance, or social value, whereas for new- 
type tests containing a fairly large number of items, 
it is probably best to count the same number of points
upon each item in any test or portion of a test contain-
ing items all in the same form. Pupils’ answers should
be carefully scored with well worked out standards
of scoring clearly in mind. On most discussion ex- 
aminations it is best to mark the answers to one ques- 
tion at a time. The often recommended sorting method 
is probably too expensive in time consumed for or- 
dinary use. A system, a school, a department, and also 
a teacher should adopt certain definite rules for guid- 
ance in scoring and marking pupils’ answers. One of 
these should be that in other subjects than English 
the mark given does not depend upon quality of lan- 
guage or handwriting. The committee system of mark- 
ing papers may be used occasionally on unusually im- 
portant examinations, but should not be employed 
regularly. Each pupil’s paper should contain such 
marks and comments as will enable him to locate all 
errors. It is important that a distinction be made be- 
tween scores and marks. Ordinarily papers should first 
be scored and the scores transmuted into marks ac- 
cording to some fairly definite plan. For an ordinary 
class this plan can be more or less informal, based on 
certain general principles. In any event it should not 
be an automatic or mechanical process but should be 
guided by good judgment and common sense. 



CHAPTER V


THE MARKING SYSTEM AND ITS MEANING 

1. Should marks be used at all? Since the last chapter
closed with a discussion of how scores are to be con-
verted into marks, it is fitting that the marking system
be treated next. Therefore this and the succeeding
chapter will be devoted to the consideration of what 
marking symbols and system should be used, what 
they should mean, how the marks should be distributed, 
and related matters. So many conflicting opinions have 
been advanced, so many diverse practices have existed, 
and so many controversial questions have been raised 
that a large volume could be filled with a critical con- 
sideration of the arguments advanced to support and 
to oppose different views. No attempt will be made to 
deal with the problem in such complete fashion, but 
the relatively important questions will be stated and 
dealt with at sufficient length to state the leading argu-
ments on both sides and suggest answers. No elaborate
defense of the recommended procedures and practices 
will be given, though some arguments will be offered 
to justify them. 

In the first place, the vital question of whether marks
in the ordinary sense should be employed at all or not
has been raised from time to time. Usually this has
been in the somewhat modified form of whether there
should be more than two marks, such as passing and
not passing or satisfactory and unsatisfactory. Only
very rarely has any one maintained or even suggested
that absolutely no marks or ratings be given pupils, 
but a number of persons have urged that only two be 
used. Although it is true, as has often been pointed out, 
that there are certain undesirable results which fre- 
quently come from most or even all of the marking 
systems in use, the writer, in common with an ap- 
parently overwhelming majority of teachers and others 
interested, does not believe that systems employing 
more than two marks should be abolished in elementary 
and high school. Most of the undesirable results, such 
as overwork, cheating and other bad practices entered 
upon in the effort to secure higher marks, the discour- 
agement and unhappiness frequently resulting from 
low marks, the invidious feelings which often issue 
from the comparison of marks by those who receive 
them, and so forth, can be largely if not entirely 
avoided by a satisfactory system of marks and its 
proper use. In other words they are not necessary re- 
sults or concomitants of marking systems, but may be 
largely avoided by taking the proper precautions. 
Some of them may be unavoidable (to a minor degree), 
but they would also probably result from even a two- 
division marking system. At least some of those who 
are rated as failing or unsatisfactory, no matter how 
it is done, will always envy those who succeed, be dis- 
couraged and probably attempt to cheat. Others will 
resort to unethical or otherwise undesirable means of 
improving teachers’ opinions of their work whether
marks are given or not. 

Marks are not justified, however, by the mere fact 
that they do not produce many or great undesirable 
results, but because they yield certain positive advan-
tages which are much greater than their vicious out-
comes. A very good statement of the purposes of mark-
ing is that of Wood (94, pp. 120-121). Although he
is dealing primarily with the marking of college stu- 
dents his purposes may be applied with slight if any 
modifications to that of elementary and high-school 
pupils as well. His statement is as follows:¹

“The purpose of grades in secondary schools and colleges
may be divided into two fairly distinct classes:

(a) Pedagogical, and

(b) Administrative.

“(a) PEDAGOGICAL

“Grades may contribute to a student’s education either
directly by calling forth his effort in answering the questions
and in performing the tasks most visibly used in deriving
his grades, or indirectly by forcing him to study effectively
in view of prospective examinations, or by enlisting his co-
operation through the incentive and competitive possibilities
of just and fair marks of his ability or achievement.

“(b) ADMINISTRATIVE

“On the administrative side we may enumerate various
plausible reasons for grades in high schools and colleges:

(1) To inform parents or guardians of the educational
status of their children or wards.

(2) To give information as to fitness of students for higher
or differentiated schooling.

(3) To enable an institution to give a relative standing
to students and so determine those to whom credit or degrees
or honors should be granted.

(4) To furnish records sufficiently meaningful and accu-
rate to be used in researches looking to vocational guidance
and vocational education, to an appraisal or comparison of
the efficiency of different school systems, of different instruc-
tors, of different methods of instruction, and of different
subject-matters, and to all questions in the resolution of
which exact information on accomplishment is indispensable.

“Whether we accept one or all of these as valid purposes
of grades, it is clear that each demands a certain indis-
pensable modicum of accuracy now absent.”

¹ This statement is quoted by permission from Wood, B. D., Measurement in
Higher Education, copyright, 1923, by World Book Company, publishers, Yonkers-
on-Hudson, New York.

Although the pedagogical purposes of marks are un- 
doubtedly the more important, yet it can easily be 
seen that if marks are not available for the administra- 
tive purposes named schools are handicapped in many 
ways. Parents and guardians have at least some right 
to know how well their children are doing in school 
and to know this in fairly definite and easily under- 
standable terms. Marks are also almost if not abso- 
lutely necessary because of the need for some method 
of informing various persons interested of the quality 
and quantity of work done by pupils. When individuals 
transfer from one school to another, when they enter 
high school from the elementary grades, or college 
from high school, when they are recommended to pos- 
sible employers, and on various other occasions there 
are distinct advantages in being able to report con- 
cerning their work in brief terms which have more 
or less generally understood meanings. 

Even though it is true that sometimes unwholesome
rivalry and undesirable practices result from the use 
of marks this condition should not be allowed to ob- 
scure the fact that in the case of most pupils marks 
may have a helpful and stimulating effect. There seems 
to be little if any doubt that reporting marks to pupils 
does, on the whole, increase their achievement. We may
wish that it were not so and that pupils were willing
to work without being marked, but in practice it is al-
most impossible to motivate the work so that they will
do so. Both the general consensus of opinion among
experienced teachers and some experimental evidence 
indicate that reporting marks to pupils results in 
better achievement than if they are not reported. It 
seems to be inborn in human nature to wish to have 
the work which we do appraised or rated and generally 
to be unwilling to continue working indefinitely with- 
out knowing something of the quality of our produc- 
tion, or at least how well those for whom we are 
producing rate it. Marks, therefore, meet this need by 
letting each pupil know what a person whom he ordi- 
narily recognizes as a more competent judge than him- 
self thinks of the merit of his work. 

2. Upon what should marks be based? There is a 
wide divergence of practice as to what elements should 
enter into the determination of marks given by teach- 
ers. In a general way it has usually been assumed that 
the marks assigned were intended to be indicative of 
the quantity and quality of pupils’ knowledge or 
achievement in the subjects in question; that is, for 
example, it was assumed that a high mark in reading 
meant that the pupil was a good reader, a low mark in 
spelling that he could not spell very well, and so forth. 
While this assumption is in the main true, more care-
ful analysis and investigation has shown that as a
matter of fact many elements enter into the determina- 
tion of marks and that these elements are not the same 
with different teachers nor are they given the same 
weight. 

Among the factors which enter into determining 
marks those in the following list may be mentioned. 
This is not intended to be an exhaustive list, nor has 
care been taken to avoid some overlapping due to dif-
ferences in wording which refer to more or less the
same thing. Its purpose is merely to give illustrative
and typical statements of factors which have been
gathered from several sources as indicative of those
which teachers have in mind. 

final attainment at the end of the semester or term 

attitude toward work 

degree of interest manifested 

amount of effort put forth 

general intelligence 

character and personality 

amount of improvement manifested 

quality of work done 

quantity of work done 

ability to do subsequent work in the same subject 

amount of initiative shown 

seriousness of purpose 

degree of preparation or study of student

final examination 

daily class work 

all written work 

combinations of oral work, written work and examinations 
in various proportions 

These factors, if interpreted broadly, appear to in- 
clude almost all of those which teachers generally con- 
sider consciously although many more have been named 
by teachers in answer to questionnaires. Johnson, for
example (39), asked forty-three principals and experi- 
enced teachers to state upon what they based marks 
and had forty-nine different qualities listed in reply. 
In most cases, marks are based upon a combination of 
two or more factors rather than upon one alone. Not 
only do teachers have many different bases of mark- 
ing in mind, but also even those who name the same 
factors as determining the marks they give will fre-
quently be found to understand and interpret these
factors differently, as well as to vary a great deal in
the relative importance which they assign to them. 

The facts briefly summarized in the last two or three 
paragraphs in addition to the experience of almost 
every reader lead to the inevitable conclusion that there 
is a wide diversity in what teachers mean by the marks 
they assign. Furthermore it appears from this stand- 
point that one is scarcely justified in comparing marks 
given by different teachers or in assuming that they
measure the same abilities or qualities unless there 
is evidence that they have somewhat the same 
basis. 

From another standpoint also there is very great
difference of practice, so that even if teachers were 
agreed upon the factors which should determine marks 
there would still be much diversity of practice as to 
their meaning. This disagreement is over the ques- 
tion of what degree or quality of each factor should 
be manifested by a pupil in order that he receive a 
particular mark. For instance, a barely passing mark 
may mean any one of the following things or various 
others:

mastery of minimum essentials 
fulfilling the letter of the law 
doing all that can reasonably be expected 
ability to carry subsequent work in the same subject 
completing 70 (or some other) per cent as much work as
the best pupil in the class or in the teacher’s experience 
completing 70 (or some other) per cent of the assigned work 
attaining 70 (or some other) per cent of perfection 
doing all required by the teacher 

The highest mark in the system, to give a second ex- 
ample, also has a number of meanings. Among these 
are: 



absolute perfection
the best in the class
completion of all that has been assigned
as good as any pupil the teacher has ever had
doing all that a pupil can reasonably be expected to do

In both cases, the lists of meanings given are only
suggestive and not at all complete. They serve merely
to indicate some of the quantitative and qualitative
standards which teachers have in mind.

It is very evident that some remedial measures are 
needed if the same mark when given by different 
teachers is to have even approximately the same mean- 
ing. Therefore it is very strongly recommended that 
in every school and likewise in every school system as
a whole the question of marking be studied by all those
who participate in giving marks and be discussed in
teachers’ meetings, with the result that certain general
principles and definitions which all teachers in the
system should follow are laid down. Each group of
teachers in the same school or system handling the
same work should ordinarily go still further and
adopt a supplementary set of guiding principles suit- 
able for the particular work which they teach. This 
study and discussion should result in agreement upon 
the marks used, upon a standard or ideal curve of 
distribution, and upon a set of specifications defining 
the meaning of the various marks employed. Several 
rather good definitions of this sort have appeared in 
print, among which are those described by Masters, 
Reeder and Whitten. 

The first of these sets of standards or specifications 
is that given by Masters (51), and is the one used in 
the Beechview-Beechwood Public Schools. It will be 
noted that it is used in connection with a five-letter
system, the letters being respectively: E for excellent,
G for good, F for fair, U for unsatisfactory, and P for
poor, and that standards covering four different phases
of school work are given under each letter. These
phases are knowledge of subject-matter, preparation,
attitude, and application.

TABLE I. STANDARDS FOR RATING PUPILS
BEECHVIEW-BEECHWOOD PUBLIC SCHOOLS

The second set of specifications, as given by Reeder
(68), is that used in the Geneseo Public Schools.
It is likewise intended for use with a five-letter
system. For P, the lowest letter, a number of more or
less disconnected specifications are given, but for each
of the other four letters those mentioned are under
three main heads (knowledge of subject-matter, prep-
aration, and attitude), each of which is divided into a
number of subheads. The whole scale is somewhat
longer than the one previously given and in a number
of points goes into more detail. From the standpoint
of content it appears to be more helpful and more likely
to insure uniformity because it is more detailed, but
from the standpoint of organization that of Masters is
perhaps preferable.

TABLE II. SCALE OF QUALITIES OF WORK
PUBLIC SCHOOLS, GENESEO, ILLINOIS*

(Continued. The three columns of the table follow in order.)

II. Preparation.
   1. Preparation daily.
   2. Preparation done thoughtfully.
   3. Written work in on time.
   4. Directions of the assignment followed as an outline.
III. Attitude.
   1. In recitation.
      a) Good position, standing and sitting.
      b) Attentive.
   2. Toward preparation.
      a) Ability to work without much assistance.
      b) Judgment in using time to advantage.
   3. Good team work.

II. Preparation.
   1. Daily preparation.
   2. Insufficient time spent in preparation.
III. Attitude.
   1. In recitation.
      a) Attention poor.
      b) Needs reminding about correct position.
   2. Toward preparation.
      a) Requires frequent assistance.
      b) No imagination or creative ability shown.
   3. Not enough regard for team work.

II. Preparation.
   1. Not constant in daily preparation.
   2. Preparation covers only about three-fourths of the assignment.
III. Attitude.
   1. In recitation.
      a) Needs reminding of correct position.
      b) Inattentive.
   2. Toward preparation.
      a) Requires much assistance.
      b) Lack of imagination or creative ability.
   3. No regard for team work.

*See reference 68.

The last set of definitions referred to, that of Whit- 
ten (93), is as follows : 

“I. SCHOLARSHIP

Preparation: 

A. Complete, regular, exhaustive supplementary reading 
exceeding expectations of teacher. 

B. Same as for A with somewhat less extensive supple- 
mentary reading and investigation. 

C. Meeting the demands and suggestions of the teacher. 
No supplementary work on own initiative. 

D. Barely covering minimum daily assignments. 

E. Careless, partial, inefficient, indifferent.

“Application:

A. (1) Attention: 100 per cent — complete concentration. 




(2) Initiative: High grade originality and ingenuity 
in research. 

B. (1) Attention: Same as for A. 

(2) Initiative: Dependable; considerable originality. 

C. (1) Attention: Ordinary. 

(2) Initiative: Not noteworthy; requires considerable 
encouragement and aid. 

D. (1) Attention: Wavering, uncertain, feeble. 

(2) Initiative: Not appreciable; helpless on new work. 

E. (1) Attention: Negative, a disturbing factor. 

(2) Initiative: Little or none; unable to follow di- 
rections. 

“Knowledge of Subject:

A. Distinguished achievement; complete mastery; exceed- 
ing expectations. 

B. Superior achievement but less complete mastery. 

C. Ordinary achievement; meets teacher’s requirements. 

D. Mastery of a bare minimum for passing. 

E. Very meager, fragmentary, inadequate. 

“Use of English:

A. Extensive vocabulary, excellent diction, correct habits. 
Rapid comprehending reader. 

B. Same. 

C. Limited vocabulary, fair reader, errors persistent. 

D. Inadequate vocabulary, slow inefficient reader, master- 
ing barest minimum essentials. 

E. Inability to ‘read’; slovenly speech, very deficient 
vocabulary. 

Progress: 

A. So rapid as to constitute a teacher’s problem. 

B. Rapid but not disturbing. 

C. Noticeable and steady. 

D. Discouragingly slow. 

E. Inappreciable.” 

It will be seen that these specifications contain five
points under scholarship and state the amount, degree,
or quality of each required for the marks A, B, C, D,
and E. The five points are preparation, application,
knowledge of subject-matter, use of English, and progress.
Although either one of the first two schemes
appears preferable to this third one, the latter has
some points of excellence and would undoubtedly be of
considerable value in accomplishing its purpose.

Another such list of definitions, which so far as the 
writer is aware has not as yet appeared in print, is 
that worked out and in use at the University High 
School of the University of Illinois. This is as follows: 

“Grade of A

1. Scholarship: Exceeding expectations of instructor.

2. Initiative: Contributions exceeding the assignment.

3. Attitude: Positive benefit to the class.

4. Cooperation: Forwarding all group activities.

5. Individual Improvement: Actual and noticeable.

“Grade of B

1. Scholarship: Accurate and complete.

2. Initiative: Stimulating some desirable achievements.

3. Attitude: Proper and beneficial.

4. Cooperation: Effective in group work.

5. Individual Improvement: Showing marks of progress.

“Grade of C

1. Work in general of medium quality.

2. Work quite strong in one or more items but weak in
others.

“Grade of D

(This grade might be produced by any variety or combination
of weaknesses as the definition suggests.)

1. Scholarship: Barely meeting assignments.

2. Initiative: Uncertain, not usually manifest.

3. Attitude: Not objectionable, usually neutral.

4. Cooperation: Not positive nor very effective.

5. Individual Improvement: Slight, not positive.

“Grade of E

1. This is a failing grade and since it may result from
any number of weaknesses is not defined.”




It will be seen that this is shorter than any of the
three previous sets of specifications, and therefore less
detailed. It has, however, been found distinctly helpful
and has continued in use for a number of years.

In giving these four scales or sets of specifications in
full, it was not the intention to recommend or even suggest
that some one of them be adopted by each school or
school system, although this could perhaps profitably
be done. It would ordinarily be much more helpful for
each school unit to work out a list of standards of its
own after having made a study of these and other similar
ones. There would probably be considerable similarity
between most of such sets of standards and one
or more of those just quoted, but there is a distinct
value in having the standards to be used by a group of
teachers actually result from their cooperative labor
and thought. Much more interest will ordinarily be felt
in using such specifications locally produced than if
a set is merely imported and adopted. More than this,
however, the attention and thought given the construction
of such standards will be profitable to all
those participating, and the meaning of the specifications
contained therein will be clearer than if an
outside set is chosen. Indeed after such a set has
once been adopted it should not be regarded as permanently
fixed and unchangeable, but from time
to time should be discussed and considered with a
view to modifying and improving it if this seems possible.

In the effort to secure a measure of pupil effort,
industry and attitude, or some combination of these,
various means have been proposed for reporting performance
or achievement in relation to ability or capacity
to achieve. The best known of these is the achievement
quotient.² This is found by expressing a pupil’s
score on a test in age units, that is, in terms of the
average age of pupils who make the same score, and
dividing by his score on an intelligence test also expressed
in age units or mental age. Sometimes the
divisor used is chronological instead of mental age and
the resulting quotient³ expresses achievement relative
to age in the ordinary sense. Although these quotients
were originated in connection with standardized tests
and have received their major use in that connection,
the suggestion has been made that they or similar expressions
be employed in connection with tests or
examinations made by the teacher. Even if it were possible,
which it is not, to determine age or other satisfactory
norms for each such test employed, the labor
of doing so would be too great to render it practicable.
Therefore, if such a procedure is to be used some
simpler method of arriving at the desired measures
or marks must be found. Several suggestions along
this line have been made.
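
The two quotients described above each reduce to a single division. The following is a minimal sketch in Python; the function names, the choice of months as the age unit, and the sample ages are illustrative assumptions and are not taken from the text. Quotients of this kind are often reported multiplied by 100, but the plain ratio is kept here.

```python
def achievement_quotient(achievement_age: float, mental_age: float) -> float:
    """Achievement age divided by mental age (both in the same age unit)."""
    if mental_age <= 0:
        raise ValueError("mental age must be positive")
    return achievement_age / mental_age


def age_relative_quotient(achievement_age: float, chronological_age: float) -> float:
    """Achievement age divided by chronological age instead of mental age."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return achievement_age / chronological_age


# Hypothetical pupil, with all ages expressed in months.
aq = achievement_quotient(achievement_age=132, mental_age=120)
sq = age_relative_quotient(achievement_age=132, chronological_age=144)
```

In this invented case the pupil achieves somewhat above what his mental age would suggest and somewhat below what his chronological age would suggest, which shows why the two divisors must not be confused.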

2 The terms accomplishment quotient and attainment quotient are sometimes
used as synonymous with achievement quotient. All are abbreviated A.Q.

3 The quotient found by dividing achievement age by chronological age is most
often called the subject quotient for a particular school subject, and the educational
quotient for a group of subjects combined. An occasional writer, however, has
unfortunately used the term achievement quotient in this sense thereby causing
some confusion of terms. Also the term ratio instead of quotient has occasionally
been used, sometimes in the one sense and sometimes in the other.

One of the simplest and probably the best of the suggested
methods requires that the pupils’ ranks on a
particular test or examination be compared with their
ranks on an intelligence test. The difference between a
pupil’s rank on the two tests indicates something of the
effort which he has put forth. This may be illustrated
very simply by a group of five pupils, whom we shall
call A, B, C, D, and E. Suppose, for example, that their
order beginning with the highest on an intelligence
test was B, C, A, E, D, and on a particular test or
examination prepared by the teacher it was A, C, E,
B, D. It is readily apparent that A, who was first upon
the subject-matter test though only third upon the
intelligence test, is putting forth much more effort than
the average and that E, who ranked third on the subject-matter
test and fourth on the intelligence test,
is somewhat above the average in effort. On the other
hand, B, although first upon the intelligence test,
ranked fourth upon the subject-matter test and is,
therefore, evidently making much less use of his ability
than most of the pupils. The other two, C and D,
maintained the same ranks on both tests and therefore
appear to have put forth approximately average
amounts of effort. It is possible to employ measures
more exact than mere ranks, but the amount of labor
required to do so appears scarcely justified by the
small increase in accuracy obtained, especially in view
of the fact that scores upon any single test are rarely
highly reliable.
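
The comparison of ranks in the illustration above can be worked mechanically. The following is a minimal sketch in Python, using the two orders given in the text; the function name and the sign convention (intelligence rank minus achievement rank) are assumptions made for the example.

```python
def rank_differences(intelligence_order, achievement_order):
    """Return, per pupil, intelligence rank minus achievement rank.

    A positive value means the pupil ranks higher in achievement than in
    intelligence; a negative value means the reverse.
    """
    if sorted(intelligence_order) != sorted(achievement_order):
        raise ValueError("both orders must list the same pupils")
    intelligence_rank = {p: i for i, p in enumerate(intelligence_order, start=1)}
    achievement_rank = {p: i for i, p in enumerate(achievement_order, start=1)}
    return {p: intelligence_rank[p] - achievement_rank[p] for p in intelligence_rank}


# Orders from the illustration in the text, highest first.
differences = rank_differences(
    intelligence_order=["B", "C", "A", "E", "D"],
    achievement_order=["A", "C", "E", "B", "D"],
)
```

The result is +2 for A, +1 for E, 0 for C and D, and -3 for B, which agrees with the verbal account: A and E are above the average in effort, C and D are about average, and B is well below.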

There is no doubt that some such plan as that just 
outlined is frequently of considerable service in stimu- 
lating the work of those pupils whose efforts tend to 
be below the average. It does not appear, however, 
that it would be desirable to substitute marks given 
on this basis for those which are largely if not en- 
tirely measures of actual achievement. So doing might 
have very desirable effects in the stimulation of pupils 
to harder work, but it would make it decidedly difficult 
for those who must interpret the marks to employ them 
for purposes of classification, promotion, recommenda- 
tion, and so forth. 

It has sometimes been suggested, and occasionally
carried out in practice, that a double
marking system be employed. Such a system ordinarily
consists of one set of marks used in reporting upon 
pupils’ work to other schools and for similar purposes, 
and a second set of marks for use only within the 
school itself or perhaps also in reporting to parents. 
The first set in such a double scheme is ordinarily in- 
tended to measure achievement, whereas the second 
usually measures effort, that is achievement in com- 
parison with ability. Another plan somewhat similar 
to this is that each mark consists of two symbols, one 
of which indicates absolute achievement and the other 
achievement relative to capacity. A possible variation 
is the use of a compound mark, one element of which 
indicates achievement relative to ability of the group 
in which the pupil is placed, and the other defines to a 
certain extent the character of the group. Thus if the 
ordinary letters A, B, C, D, and E are used to repre- 
sent five degrees of achievement, they may be followed 
by the figures 1, 2, and 3, indicating respectively 
superior, average, and inferior sections. According to 
this plan a mark of A1 indicates that the pupil receiv- 
ing it is one of the few best in a superior group where- 
as a mark of A3 indicates that he is one of the best 
in an inferior group. At least two or three writers have 
gone further than this and urged a triple or even more 
elaborate marking system. It has been suggested that 
one mark or one portion of a compound mark be given 
upon achievement, one on effort, one on attitude or 
interest, one on behavior or school citizenship, and so 
forth. 

The chief general objection to any such double or 
more elaborate plan of marking is the extra amount of 
work which it entails. Teachers usually feel that it is
a real burden to be asked to give each pupil more than
one mark and, unless it is very clear that a considerable
amount of gain results therefrom, are liable both to 
complain considerably if required to do so and to give 
most of the marks with such hurried and superficial 
consideration that they are of little value. It is there- 
fore not recommended that such a multiple marking 
plan be incorporated as a regular part of school pro- 
cedure except in situations in which it appears that 
the teachers are more or less anxious to have it and 
are, therefore, interested enough to devote the requi- 
site amount of time and thought to determining and 
recording the marks. 

An important point which must be decided by any 
teacher before she determines semester or other simi- 
lar marks for her pupils is that of how much to weight 
each of the kinds or portions of work which enter into 
the determination of these marks. That is to say she 
must decide how much to count upon daily oral recita- 
tions, how much upon short quizzes, how much upon 
written reports and other work prepared outside of 
class, how much upon the final examination, how much 
upon laboratory work if there is any, and so on for 
all of the types of work or opportunities for pupils
to exhibit their achievement. Little or no attention was 
paid this phase of the problem in any of the lists of 
specifications quoted above. Many schools however 
have definite requirements as to what proportion of 
the semester or annual mark should depend upon the 
final examination, and some have similar rules with 
regard to short tests, note-books, and other factors. 

It does not seem best that any definite and exact 
rules be laid down to guide all teachers or even the 
same teacher at all times in this respect. The proportions
should differ considerably because of differences
both in the subject-matter covered in the course and in 
the methods of instruction used. It is probably the 
general tendency of teachers to count somewhat too 
heavily upon the final examination and also upon other 
written tests. On the other hand it is contended with 
much justice that the mark given should be representa- 
tive of the achievements of the pupils at the completion 
of the course regardless of when the ability to achieve 
was acquired and that a satisfactory final examination 
is probably the best single measure of this ability and 
should therefore count as a large proportion of the 
total mark. As opposed to this argument it is urged 
that the mere possession of knowledge at the time of 
the final examination should not be weighted too heav- 
ily since this knowledge may have been acquired by 
cramming within a comparatively short period preced- 
ing the examination, and that if this is the case it will 
not be retained as well as if it had been secured by 
study and constant work during the course. It appears 
that this fact, and the closely connected one that if pu- 
pils know that a rather large proportion of the final 
mark depends upon daily work they will be stimulated 
to regular study and to keeping their work up to date, 
more than outweigh the argument given above for 
counting heavily upon the final examination. In those 
subjects in which a fairly comprehensive and thorough 
idea of total achievement can be obtained from the 
results of a single examination, these results should of 
course count somewhat more heavily than in a subject 
in which this is not the case. Tests which are given 
primarily for diagnostic purposes, that is, to show the 
weaknesses of the class so that these weaknesses can 
be remedied, should not count very heavily since presumably
conditions revealed by them have been corrected
later in the course.

As a general rule, it is recommended that a single 
final examination should never count for less than 10 
per cent of the mark given, nor for more than 25 per 
cent, that all written examinations and tests together 
should not count for less than 25 nor more than 50 per 
cent, that oral class work from day to day should never 
count for less than 33 1/3 per cent, and that class work
from day to day, both oral and written, should never 
count for less than 50 per cent. Occasionally, however, 
exceptions ought to be made to these limits. For ex- 
ample, in physical education, in which ordinarily there 
is very little oral class work, and perhaps also in such 
subjects as manual training, sewing and cooking, in 
which the amount of oral work is generally small, 
practically nothing should be counted upon oral reci- 
tation and also less upon written quizzes than in the 
ordinary so-called “book” subjects. 
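
The limits recommended above can be checked against any proposed weighting. The following is a minimal sketch in Python. The division of the final mark into four components (final examination, other written tests, daily oral work, daily written work), the function names, and the sample figures are assumptions made for illustration; the text itself lays down only the limits.

```python
def check_weights(final_exam, other_written_tests, daily_oral, daily_written):
    """Return a list of the recommended limits that a weighting scheme violates.

    All arguments are percentages of the final mark and should total 100.
    """
    problems = []
    total = final_exam + other_written_tests + daily_oral + daily_written
    if abs(total - 100) > 1e-9:
        problems.append("weights do not total 100 per cent")
    if not 10 <= final_exam <= 25:
        problems.append("final examination outside 10-25 per cent")
    if not 25 <= final_exam + other_written_tests <= 50:
        problems.append("written examinations and tests outside 25-50 per cent")
    if daily_oral < 100 / 3:
        problems.append("daily oral work below 33 1/3 per cent")
    if daily_oral + daily_written < 50:
        problems.append("daily class work below 50 per cent")
    return problems


def composite_mark(scores, weights):
    """Weighted average of component scores; weights are percentages."""
    return sum(scores[k] * weights[k] for k in weights) / sum(weights.values())


# Hypothetical weighting and scores for one pupil.
weights = {"final_exam": 20, "other_written_tests": 20,
           "daily_oral": 40, "daily_written": 20}
scores = {"final_exam": 80, "other_written_tests": 70,
          "daily_oral": 90, "daily_written": 85}
violations = check_weights(**weights)
mark = composite_mark(scores, weights)
```

A scheme counting 40 per cent upon the final examination and 30 per cent upon daily oral work would be reported as violating two of the limits. For subjects such as physical education or manual training, in which exceptions are advised, the oral limit would simply be relaxed.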

In order to determine the portion of the final mark 
based upon daily work, it is the practice of some 
teachers to mark pupils on every oral recitation or at 
least on every one of more than a few seconds in length. 
They believe that by so doing they accomplish two ends, 
at least. First, they claim to secure adequate marks 
of pupil performance, and, second, to stimulate the 
pupils by making them feel that all or practically all 
they do is important and counts in determining their 
final mark. The bulk of opinion is to the effect that this 
practice is undesirable. It is practically impossible for 
a teacher to record marks for short answers or recita- 
tions of a few words each without slowing up the class 
recitation appreciably or else distracting her attention
from what is going on. Moreover it is not necessary
to do so in order to secure a sufficient number of
marks. Instead of marking every response it is recommended
that the teacher record marks for all long recitations,
that is, for all topical reports and discussions
and other recitations in which individual pupils consume
several minutes or more, and in addition occasionally
record marks for short recitations, thus merely
taking a sampling of the latter. For this purpose one
or two marks on oral work per week will ordinarily be
sufficient.

Before presenting the several sets of specifications
for marks some very brief and general suggestions
were made as to the general program to be followed
by a school or school system in improving or reconstructing
its marking system. Somewhat more detailed
but still concise suggestions have been offered by Rugg
(76, pp. 81-82). He presents the following list⁴ of
steps which in his opinion are both desirable and practicable
in rebuilding our public school marking system.

“1. The evident lack of reliability and consistency in
teachers' marks, and their evident inaccuracy as measures
of ability, demand of administrators the initiation of cam- 
paigns of education among their teachers to a recognition 
of the importance of the facts set forth. In the carrying 
on of such campaigns helpful administrative devices have 
been found to be: 

(a) The publication of the distribution of teachers’ marks 
each semester in open bulletins.

(b) The discussion of these bulletins in teachers’ meetings. 

(c) Insistence that each teacher tabulate and plot graphs 
of her distribution of marks each semester before sub- 
mitting them to the office. 


4 Quoted from Rugg, Harold, A Primer of Graphics and Statistics for Teachers,
Boston: Houghton Mifflin Company, 1925, by permission of the publishers.





(d) Requiring reading and discussion of the use of dis- 
tribution-curves in marking. 

(e) Insistence that each teacher rank her pupils prior 
to assigning final marks, whether on examination, 
‘paper,’ ‘quiz,’ or semester's work.

(f) The appointment of departmental committees with
instruction to define, in detailed word-statements, each 
grade of ability represented on the marking scale. 

(g) The use of objective scales and tests in all those sub- 
jects and for all those types of subject-matter for 
which such tests and scales are now available. 

(h) The use of ‘general-ability' tests for purposes of classi- 
fying pupils, and of detecting various grades of ability 
in our pupils early in the course of instruction. 

“2. If letters are now used in a school system, certainly 
each one should be merely a symbol to signify the abilities- 
to-do which have been defined in very detailed worded 
statements, understood alike by teacher, administrator, and 
pupil, in terms of which the instruction has constantly been 
oriented. 

“3. If numbers are used they should be employed only 
as economical symbols to represent the various groups or di- 
visions of the marking scale; that is, only one number (for 
example, the median one) should be used to typify a group 
of marks (as, for example, the ‘excellent' group, the ‘A' 
group), instead of using, as is now common, 91, 92, 93, 94, 
95, 96, 97, 98, 99, 100. Thus the only numbers used would be 
such clearly separated numbers as 95, 90, 85, etc. Numbers 
in themselves would cease to have the ‘picayunish' absolute 
differences in meaning that they pretend to now, and would 
merely stand for abilities which have been distinguished 
clearly through detailed worded statements. 

“4. As a practical and helpful tool in the adequate meas- 
urement of student work, measurement by ranking with sub- 
sequent transmutation to absolute marks by means of a dis- 
tribution-curve (preferably the normal probability-curve) is 
very desirable." 
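
The fourth step in the list above, ranking followed by transmutation to marks by means of a distribution-curve, may be sketched as follows in Python. The five letters, the cut points of 1.5 and 0.5 standard deviations on either side of the mean, and the function names are assumptions chosen for illustration; the quoted list does not prescribe particular proportions. With these assumed cut points the normal curve yields roughly 7, 24, 38, 24, and 7 per cent for the five letters.

```python
from statistics import NormalDist

LETTERS = ["A", "B", "C", "D", "E"]
# Assumed cut points, in standard deviations, between adjacent letters.
CUTS = [1.5, 0.5, -0.5, -1.5]


def cumulative_shares(cuts=CUTS):
    """Share of the normal curve lying above each cut point, from the top down."""
    normal = NormalDist()
    return [1 - normal.cdf(c) for c in cuts]


def marks_from_ranking(ranked_pupils, letters=LETTERS, cuts=CUTS):
    """Assign letters to pupils listed from best to poorest."""
    shares = cumulative_shares(cuts)
    n = len(ranked_pupils)
    marks = {}
    for position, pupil in enumerate(ranked_pupils):
        # Midpoint of this pupil's slice of the ranking, counted from the top.
        fraction_from_top = (position + 0.5) / n
        # Count how many cumulative shares this pupil has passed.
        index = sum(fraction_from_top >= s for s in shares)
        marks[pupil] = letters[index]
    return marks


# Hypothetical class of 100 pupils already ranked from best to poorest.
pupils = [f"pupil_{i}" for i in range(1, 101)]
marks = marks_from_ranking(pupils)
counts = [list(marks.values()).count(letter) for letter in LETTERS]
```

The sketch makes no provision for tied ranks, and with small classes the proportions can be only roughly approximated, so the teacher's judgment would still be needed at the boundaries.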

A number of superintendents, principals, and others
have given helpful accounts of what has been done
to standardize the marking systems in their schools.
Although their procedure has usually tended to follow
the steps outlined by Rugg, it seems worth while
to refer to at least one actual case, described by Camp
(12). He relates the procedure followed in the Stamford,
Connecticut, High School. Although readers are
advised to read Camp’s article in full, it seems well 
to summarize his account for the benefit of those who 
cannot do so. He states that the problem arose largely 
from the fact that only about one-half of the pupils 
were rated as doing work above passing, three-eighths 
more being rated as just at passing, and that the aver- 
age mark given was less than 4 per cent above the 
passing mark. A series of three teachers’ meetings 
was held to consider the matter. The first was devoted 
chiefly to a discussion of the basis of marking, and 
resulted in finding that, as is usually the case, great 
diversity of practice existed within the school. After 
discussion, however, it was agreed that marks should 
be based upon present native ability and accomplishment,
in the consideration of which seven specified
factors should be taken account of. The next question 
taken up was that of what a passing mark should mean. 
Again there was great diversity of opinion, and again 
some agreement was reached in that a tentative defini- 
tion of a passing mark was formulated. This was ap- 
plied to particular subjects and definite specifications 
for each worked out. The same was then done for the 
highest mark given. The next topic discussed was the 
matter of just what marking scale should be used, 
how many steps it should have, and finally what the
distribution of marks should be. The fact that there 
is much greater agreement between teachers when they 
attempt to give only a few marks than when exact percentile
marks are used was brought out, and finally a
five-letter system was adopted. The distribution of 
marks given by each teacher was plotted and a number 
of conferences held. Finally, a suggested though not 
mandatory upper limit on the per cent of failures was 
established. Apparently the net result of the series of 
meetings was a considerable amount of improvement 
in the marking system. 

3. What marks should be employed? Especially in
the elementary but also in the high school the gener- 
ally employed plan of marking in this country has been 
and still is the percentile system. In using it marks 
are generally reported to the nearest per cent and 
sometimes even in fractions of per cents. A compara- 
tively few elementary schools and a larger propor- 
tion of high schools use systems employing only a few, 
usually from four to seven, letters, figures, or other 
symbols. There are a few cases of percentile systems
in which the even fives or only the tens are used, but 
these are rare. The diversity of practice in the details 
of marking systems is very great. For example, a study 
(63) made a few years ago by the writer shows that,
among the 281 high schools in the state of Illinois
which supplied information, there were almost 100 different
systems of marking in use. If such minor variations
as the use of E to denominate failure in some
schools and of F for the same purpose in others, or 
the use of S, for superior, as the highest mark in some 
and of E, for excellent, as the highest in others, were 
disregarded, 28 marking systems remained which were 
different in that they represented marked differences 
either in the actual symbols used or in the systems 
themselves. Other investigations and reports show 
that the situation among these Illinois high schools is
representative rather than unique. Such great diversity
has many undesirable results and it would be a
real gain if a single generally accepted marking sys- 
tem could be adopted. Apparently, however, this can- 
not be expected in the near future, at least. 

In the study just referred to it was found that approximately
three-fourths of the high schools included
employed percentile systems and one-fourth systems 
in which a few letters or perhaps figures were used. 
Despite the fact that it is used by a minority of schools, 
a system of the latter sort is strongly recommended. 
Although the percentile system has the sanction of 
traditional and general usage and although many 
teachers, especially those who have been in service a 
considerable period of time, believe that they can give 
percentile marks which are highly accurate, it appears 
that the use of that system, at least in its usual form, 
can hardly be justified. Several of our leading educa- 
tional psychologists have expressed opinions, based 
upon both general knowledge of the situation and ex- 
perimental study, that even decidedly good teachers 
cannot distinguish more than a few degrees of ability 
or difference in achievement and the mass of evidence 
which has been gathered concerning the unreliability 
of teachers’ marks strongly supports this conclusion. 
The use of a percentile marking system therefore tends 
to give an illusion of accuracy which is not present. 
It is very doubtful if pupils who receive marks 1 or 
2 per cent, or even 5 per cent, above those of other 
pupils are really superior in the trait being marked. 
The giving of such marks therefore frequently produces
a feeling of undue elation on the part of some
pupils and of undue discouragement on that of others.
Differences which in many cases do not exist or are
even just the reverse of what they appear are indicated.
School honors and other awards given on the
basis of marks are in many cases inequitably bestowed.

The extreme extent to which the idea of false ac- 
curacy in determining percentile marks is carried by 
some teachers, especially those who have been in the 
profession a number of years, may be illustrated by 
the following incident which the writer happened to 
witness. A speaker at a teachers’ institute had made a 
not very forceful talk in which he urged the use of a 
marking system employing only a few letters. When 
he had completed his talk, the former county super- 
intendent, who had held that office during ten years 
up until a few weeks before the incident described and 
who enjoyed considerable influence and respect among 
the teachers present, arose to defend the percentile 
marking system. After making a number of points, he 
gave an illustration which he thought would clinch 
his side of the argument. He stated that the previous 
spring it was generally expected that one of two pu- 
pils in the county would make the highest average on 
the general examination given to all eighth-grade
graduates, and that therefore he marked the papers
of these two pupils himself rather than turning them
over to the regular scorers. After carefully marking
these papers, each of which covered twelve subjects,
he gave an average grade of 94 1/4 per cent to one of the
pupils and one of 94 1/3 per cent to the other. He said,
“I did not attempt to cover up the difference or to
avoid making a decision one way or the other by giving
both of them A or even both 94 per cent, but I
studied their papers carefully until I found the difference,
even though it was only one-twelfth of 1 per
cent. My conscience is clear on the matter since I
did that.” At once a general laugh broke out among
those present, since they perceived the absurdity of 
any one being sure that he could detect one-twelfth of 
1 per cent in such a case. Furthermore, the general ef- 
fect of the ex-county superintendent’s illustration was 
to do much more to cause the teachers present to be- 
lieve in a letter system of marking than did anything 
said by the man who had talked in favor of such a sys- 
tem. 

There may be occasions on which relatively accurate 
percentile marks can be given on some particular test 
or piece of work. For example, if a spelling test of 
twenty-five words has been given, it is fairly satis- 
factory to count 4 per cent on each word and to give 
a mark on that basis even though the words are not 
of equal difficulty. On the whole, however, it is recom- 
mended that even in cases such as this a point score 
system be employed. In the instance cited, 25 points, 
one for each word, may be allowed and the scores reported
to the pupils in terms of the number of words
spelled correctly. It was stated in a preceding para- 
graph that several psychologists felt they had good 
grounds for believing that teachers could not distin- 
guish more than a few degrees of ability or difference 
in achievement, but no actual data were given to sup- 
port this position. Among the psychologists referred 
to is Starch (80, pp. 10-11), who has attempted to
determine the proper size of intervals in a marking 
system from some of the same data which he used to 
show the unreliability of ordinary marks. He made 
use of the principle that a symbol or unit on a scale 
should be large enough in its range to include 75 per 
cent of all the marks assigned a given paper. Applying
this principle to some of his data which show the
unreliability of marks given by the same teacher, he
found that in terms of the ordinary percentile mark- 
ing system with passing at 75, the interval from one 
mark to another should be roughly 5 per cent. He 
states, “These are the smallest divisions that can be 
used with reasonable confidence by a teacher in grad- 
ing his own pupils.” In other words, if passing marks 
are thought of as ranging from 75 to 100, only six such 
marks should be used, or if from 70 to 100, seven. 
Furthermore, Starch applied the same principle to the 
marks given by different teachers and on this basis 
concluded that only three passing grades can be re- 
garded as fairly reliable. Probably the best practice is 
to employ a marking scale in which the number of sym- 
bols used is somewhere between the two figures just 
mentioned, that is, six or seven, and three. It should 
be noted, however, that there is nothing about the 
investigations of Starch or any one else along the same 
line which indicates that it is highly desirable to employ
just as many symbols as can possibly be considered
at all reliable.
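
The reasoning about the size of intervals may be illustrated by a short computation. The following Python sketch takes one possible reading of the 75 per cent principle, namely that the interval is the narrowest range containing at least three-fourths of the marks assigned to one paper; this reading, the function names, and the sample marks are assumptions and are not Starch's own procedure or data. The second function merely reproduces the count of six or seven passing marks stated above.

```python
import math


def smallest_interval_covering(marks, share=0.75):
    """Width of the narrowest range that contains at least `share` of the marks."""
    if not marks:
        raise ValueError("at least one mark is required")
    ordered = sorted(marks)
    needed = math.ceil(share * len(ordered))
    return min(
        ordered[i + needed - 1] - ordered[i]
        for i in range(len(ordered) - needed + 1)
    )


def number_of_passing_marks(passing, top, interval):
    """Count the marks from `passing` to `top` spaced `interval` apart, both ends included."""
    return (top - passing) // interval + 1


# Hypothetical marks given to one paper on repeated readings.
readings = [78, 80, 81, 82, 83, 84, 85, 88]
width = smallest_interval_covering(readings)
steps_from_75 = number_of_passing_marks(75, 100, 5)
steps_from_70 = number_of_passing_marks(70, 100, 5)
```

For the invented readings the narrowest range holding six of the eight marks runs from 80 to 85, a width of 5 points, and an interval of 5 gives six passing marks when passing is 75 and seven when it is 70.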

In view of the preceding discussion, the use of a 
marking system of five or six symbols is recommended. 
Although the writer has a slight preference for the 
first letters of the alphabet, A, B, C, and so forth, he 
would not strongly urge their use instead of such ab- 
breviations as S for superior, G for good, M for me- 
dium, and so on, or a figure system such as 1, 2, 3, 4, 
and so forth, or any other set of symbols which is 
convenient to use and easy to understand. If five 
marks, such as A, B, 0, D, and E, are used, the occa- 
sional use of A-f- as a mark for very high achieve- 
ment and of C — to indicate those pupils who have 
just barely passed is recommended, also that both
D and E be employed as failing marks. If six symbols 
are employed, four of them should be passing marks 
and it is doubtful if A+ and C- are needed, although
there is perhaps no strong argument against their 
use. In any case, there should be two failing marks 
so that pupils who come fairly close to passing may 
be distinguished from those who fail badly. Many sys- 
tems employ other plus and minus signs than the two 
just mentioned, A+ and C-, but on the whole this
seems undesirable. If only three passing letters are 
used with plus and minus signs also, it results in a 
marking system having really nine passing symbols 
besides at least three failing ones, and therefore the 
marks given tend to become decidedly unreliable. How- 
ever, if a pupil has done such work that it is hard 
to determine whether he should receive a B or a C, 
for example, it may be desirable if he is given a B to 
add a minus sign to it to indicate to him that he is 
just barely over the line or if he is given a C to add 
a plus sign to indicate that he is doing very strong
C work and by just a little more effort may earn a 
B. Such use of plus and minus signs should, however, 
be limited to a comparatively small per cent of the 
marks given and furthermore should probably be only 
during the course of a semester and not in connection 
with the final mark entered upon the school records. 

One question which has received no attention so far 
except by implication is that of whether or not there
should be a mark for conditions, that is, for work of 
such a quality that the pupil doing it is neither given 
credit nor failed outright, but whether or not he re-
ceives credit is left to be determined by the quality of
his work in the future. Such marks are not employed 
in many elementary schools, but they are fairly com-
mon among high schools. The writer, however, believes
that no system of conditions, and therefore no mark 
for conditions, should be employed. Although it is pos-
sible that there are unusual cases in which such a mark
is the best way of dealing with the situation, it ap- 
pears that these few instances are much more than 
offset by the general tendency of a system of conditions 
to make pupils less anxious to secure passing marks.
If pupils know that those who do not quite earn pass- 
ing marks but come close to doing so are to be condi- 
tioned rather than failed, and can remove the condi- 
tion by doing better work the succeeding semester, 
many of them will be entirely content to follow this 
procedure, whereas if the only hope of securing credit 
for a semester’s work is to make a clear-cut passing 
mark therein they will be stimulated to do so. In the 
few exceptional cases, such as those of pupils who have 
been absent a considerable amount of time, in which 
passing marks have not been earned and yet failing 
ones seem undeserved, the situation can be taken care
of by giving no marks at the time but waiting to do 
so until the pupils concerned have had an opportunity 
to make up the work and thus to give evidence of how 
well or poorly they have mastered it. 

4. Summary. Although it has been occasionally 
suggested that no marks other than satisfactory and 
unsatisfactory be employed, the arguments for so do- 
ing do not appear to be sufficiently valid to justify the 
practice. The use of marks undoubtedly does lead to 
certain undesirable results, but these may be greatly 
lessened and, even if not, should be decidedly overbal- 
anced by the favorable results. A study of the bases 
upon which teachers determine marks shows a great 
deal of variation both in the factors considered and
in the relative weight given each. It is, therefore, de- 
sirable that some set of definitions or specifications be 
adopted to guide the teachers in a given school or 
system so that their marks may more nearly have the 
same meaning. Among the best of such sets of specifi- 
cations are those described by Masters, Reeder, and
Whitten, and the one in use in the University High
School of the University of Illinois. It is, however,
recommended that no one of these or any other such 
scheme be bodily adopted by a group of teachers, but 
rather that they work out their own. Rugg has out- 
lined a very good general program for the reconstruc- 
tion of a marking system and the steps therein may 
with profit be followed by any school. The actual pro- 
cedure of the teachers in the Stamford, Connecticut, 
High School also furnishes a model. Although major- 
ity practice approves its use, the percentile system 
as ordinarily employed is not satisfactory, but should 
be replaced by one which employs comparatively few 
symbols, from three to five passing marks, two fail- 
ing marks and none for conditions. If an attempt is 
made to mark much more finely than this, the unre- 
liability of the marks given is liable to be decidedly 
great. 



CHAPTER VI 

THE DISTRIBUTION OF MARKS

1. Should marks follow the normal or any other
fixed frequency distribution? One of the questions 
concerning which there has been considerable and 
often rather heated argument pro and con is that of 
whether or not the marks given by a teacher should 
approximate a predetermined frequency distribution. 
In most cases the normal distribution is the one sug- 
gested. Basing their arguments chiefly upon the appar- 
ent fact that most human traits and abilities are dis- 
tributed according to the normal frequency curve, many 
persons have urged that marks should conform to this 
curve, at least in all cases in which fairly large numbers
are concerned. Others admit that human traits or abili- 
ties tend to form normal distributions, but maintain 
that most actual distributions of marks should be modi- 
fied to fit known or assumed differences between the 
particular groups of pupils to which they are assigned
and an average or random group. The adherents of 
this view have in some cases suggested definite skew
distributions,¹ that is, distributions which are not
normal or symmetrical, and in other cases have laid
down principles or given general suggestions by which
the desired distribution can be determined in each case
rather than urging the use of any given distribution.
Still others have denied the desirability of attempting
to make marks fit any predetermined or general dis-
tribution, either normal or otherwise, maintaining that
each teacher’s judgment is the best guide in assign-
ing them.

¹ A skew distribution may be thought of as a normal distribution which has
been pulled or pushed in one direction or the other so that the extreme cases at
one end are considerably further from the average than are those at the other end.

There are certain general arguments which may be 
cited in favor of approximate conformity to some 
standard or ideal distribution of marks regardless of 
whether it be the normal or some other. Most of these 
arguments, however, will be found upon analysis to 
be more or less synonymous, the chief point made be- 
ing that teachers’ marks as commonly given are sub- 
jective, unreliable, and inaccurate, that those given 
by one teacher do not have the same meaning as those 
of another, and that the use of a standard distribu- 
tion is the best means of remedying the situation. 
Some teachers are high markers, others low markers, 
so that the mark any pupil receives is largely a mat- 
ter of chance as to what teacher he has. Moreover, of 
a number of teachers whose average marks are the 
same or approximately so, some are likely to give large 
proportions of both high and low marks and compara- 
tively few average ones, whereas others probably 
bunch their marks much more closely around the aver- 
age. It has been argued rather strongly that adopting 
a standard distribution to be followed is the most ef- 
fective single means of improving the situation, that 
is, of securing at least a degree of desirable uniform- 
ity in the meaning of marks given by different teachers, 
or, for that matter, in those given by the same teacher 
at different times. 
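
The point that teachers whose average marks agree may still spread their marks very differently can be illustrated with a small calculation. The two sets of marks below are invented for the purpose and are not drawn from any investigation; the names teacher_x, teacher_y, and summarize are likewise hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical percentile marks given by two teachers to five pupils each.
teacher_x = [70, 75, 80, 85, 90]    # marks bunched closely around the average
teacher_y = [60, 70, 80, 90, 100]   # larger proportions of high and low marks


def summarize(marks):
    """Return the average mark and the standard deviation (spread) of the marks."""
    return mean(marks), pstdev(marks)


for name, marks in (("X", teacher_x), ("Y", teacher_y)):
    average, spread = summarize(marks)
    print(f"Teacher {name}: average {average}, spread {spread:.1f}")
```

Both invented teachers average 80, yet the spread of the second is twice that of the first, so a mark of 90 does not mean the same thing in the two classes.
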

It is not in accord with the plan of this book to at- 
tempt to present an entirely exhaustive list or con-
sideration of the arguments advanced in favor of and 
against the use of a standard distribution in assigning 
school marks. However, several pages will be devoted 
to mentioning and discussing briefly the chief argu- 
ments on both sides, and especially to refuting some 
of the unfavorable ones. In a number of cases it will 
be shown that the adverse arguments lose much or per- 
haps even all of their force if a standard or model 
distribution is applied in a reasonable and sensible 
way. In other words, they are valid only in so far as 
they are directed against extreme or over-mechanical 
use of such a distribution. 

The argument most often advanced against assign- 
ing marks according to a fixed or predetermined dis- 
tribution is that it is very doubtful, indeed perhaps 
improbable, that most single classes or groups of pu- 
pils handled by teachers actually constitute random or 
near-random samples of all pupils of the same type. 
In other words, many of the opponents of such a plan 
of assigning marks admit that it might apply with a 
fair degree of justice to large groups of pupils con- 
taining hundreds of individuals, but they maintain that 
it should not be applied to any one ordinary class. 
They claim that although some classes are average, 
others are distinctly above average, and others dis- 
tinctly below; that some are comparatively homoge- 
neous, others comparatively heterogeneous; that the 
teacher rarely knows which condition holds for any 
particular group. Therefore, they assert, the distribu- 
tions of marks given in different classes should differ 
considerably. Replying to this argument, most of those
who favor the adoption of more or less fixed distribu-
tions agree that they should not be applied rigidly,
if at all, to small groups. Thus this vital point of dis-
agreement between proponents and opponents of the
plan largely, if not entirely, simmers down to a matter
of the size of the group to which it should be applied.
This point will be considered later in discussing the 
use of the plan. In addition, however, some of the pro- 
ponents answer the unfavorable arguments in part by 
asserting that if a class is of more than average abil- 
ity, it should be held up to a higher than average stand- 
ard of work, whereas if it is below average in ability 
the standard applied should not be as high as that for 
most pupils and therefore that practically the same dis- 
tribution of grades should be given in one class as in 
another. Furthermore, it is maintained with a consider- 
able degree of truth that teachers tend to exaggerate 
differences in classes and that unless some plan of 
grouping based upon ability has been employed the 
chances are large that an elementary or high-school 
class of more than a very few pupils will not differ 
significantly from the average. 

It is sometimes said that the present system is satis- 
factory. The answer to this statement is that many, 
probably most, competent persons acquainted with 
present conditions do not believe that it is. This opin- 
ion appears to be well supported by a considerable 
body of evidence, much of it objective, to which some- 
what more explicit attention has been given elsewhere 
in this book. This evidence indicates that teachers not 
only differ widely in the marks assigned to particular 
papers, but also in those given for the work of a whole 
semester or year. 

A third objection is that the proposed plan of dis- 
tributing marks is too mechanical. It cannot be denied 
that there is a certain more or less mechanical element 
in its application, but it should be recognized that be-
cause a procedure is mechanical it is not necessarily 
undesirable. Certainly it is true that many activities 
of school teachers should not be routinized, but on the 
other hand it makes for efficiency to have more or less 
mechanical methods and procedures for use in cer- 
tain situations. No objection is made by any one to 
what might be called a mechanical method of taking 
attendance, for example, or of collecting papers, but 
on the contrary it is recognized that such a method 
saves the teacher's time, energy, and thought for ex- 
penditure along more profitable lines and furthermore 
that it frequently insures that the attendance is actu- 
ally taken or the papers collected. In a similar manner 
some advantage may be claimed for employing a sys- 
tem of assigning marks in which there is a more or 
less mechanical or fixed element. Moreover, even if a 
fixed distribution of marks is followed with extreme 
and undesirable rigidity so doing does not at all de- 
termine which pupils will receive each of the various 
marks given, but leaves this to the judgment of the 
teacher. 

One argument against the use of the normal curve 
of distribution for the marks given by teachers is that 
the influence of the incentives present in school work 
is such as to result in a distribution of achievements 
which is not normal. Those who believe thus are some- 
times willing to assume for the sake of argument that 
the distribution of ability of a particular class is likely 
to be near normal, but maintain that performance is 
affected by many factors, and that among these incen- 
tives or motives are prominent. The particular incen- 
tives which probably have the greatest effects in this 
connection are those having to do with reaching cer- 
tain definite marks. Pupils who are not falling very
far short of passing frequently put forth extra efforts, 
and, in many cases, raise their marks to passing or 
just above, whereas other pupils who could do better 
work are content to earn barely passing marks. There- 
fore if marks are given in accordance with actual 
achievement rather than with ability, an accumulation 
of marks at this point results. The same is true, though
probably to a lesser degree, of other points on the 
marking scale. For example, if exemptions from ex- 
aminations are allowed there will usually be a bunch- 
ing of marks around the exemption point. It is there- 
fore urged that to distribute marks in accordance with 
a normal distribution is to falsify the situation and 
neglect the effect of these very real factors in achieve- 
ment. There is no doubt that there is some truth in the 
contentions just stated, and this is one of the reasons 
why it is not desirable to follow a fixed distribution of 
marks, especially in case the fixed distribution is a 
normal one, with a high degree of exactness. In the 
cases of different groups which have practically the 
same incentives and motives, however, it may be ex- 
pected that the marks of such groups will form dis- 
tributions similar enough that a standard may be 
adopted for most or even all such groups of pupils. 

It is also charged that by following a set distribu- 
tion a teacher is deprived of the freedom she should 
have in marking. It is true that such a requirement 
does limit teachers’ freedom, but the same charge could 
be brought against almost any supervisory or admin- 
istrative regulation enforced in school. A teacher who 
may prefer a school day lasting from 8:00 A.M. to
2:00 P.M. is required to conform if the system wherein
she teaches determines that its day shall be from 9:00
A.M. to 3:00 P.M., but she rarely complains that her
personal freedom is unduly restricted. Similarly, one 
who wishes to use a certain system of marking, such 
as the letter system, must comply with the practice 
of the school which employs her if some other system 
is in use. In other words, the mere fact that some 
teachers are deprived of a degree of freedom is no 
argument against a particular marking system or any 
other plan provided it makes for increased efficiency 
in school and does not deprive the teacher of so much 
or so vital a part of her freedom as to injure her. 
Furthermore, it may be said that if some teachers are 
to be permitted to teach at all, they certainly should 
be deprived of the freedom they wish to exercise in 
giving pupils marks very much higher or lower, or in
some other way very different, from those of most 
teachers. 

It is sometimes said that the use of a predetermined 
distribution requires that certain good students be 
failed or poor students passed. It is true that it may 
do so if administered unintelligently, but because it 
is subject to abuse if improperly used is no justifica- 
tion for condemning the entire plan. Moreover, it is 
largely a matter of subjective judgment as to whether 
or not pupils are rated superior, average, inferior or 
otherwise. Those whom one teacher rates as good 
enough to be passed or even decidedly higher than 
passing, another might on the same evidence rate as 
failures, and vice versa. 

The statement is frequently made that certain sub- 
jects, courses or teachers tend to draw groups of pu- 
pils superior or inferior to the average, or that cer- 
tain subjects or courses are inherently more or less 
difficult than others, and that therefore it should not 
be expected that the same distribution of grades will
be followed in all. This condition undoubtedly exists
in many schools, but is one to be abolished rather than 
favored. Courses should not be so organized that earn- 
ing a unit of credit in one subject is markedly easier 
or harder for pupils in general than is earning one 
in some other subject. For example, if, as is sometimes 
said, Latin is a harder subject than English, or arith- 
metic a harder subject than geography, the amounts 
of work required should be so proportioned that for 
an average pupil to do passing work in one subject 
during a semester requires approximately the same 
amount of time as to do passing work in the other. 

It has also been argued that pupils in rather small 
classes usually do better work than those in large 
classes, that the best teachers inspire or otherwise 
cause their pupils to do better work than do those 
of most teachers, and that therefore pupils in small 
classes or under superior teachers should receive 
higher average marks than those in large classes or 
under inefficient teachers. In reply, the apparent re- 
sults of a number of investigations dealing with class 
size have, on the whole, indicated that within reason- 
able limits small classes are little if any more efficient 
than large ones. Even if they are, and if, as cannot 
be denied, certain teachers are more efficient than 
others, the question may be raised as to whether it is 
fair that certain pupils should receive higher marks 
than others merely because they happen to be in 
smaller classes or under better teachers, whereas 
others happen to be in larger classes or under poorer 
teachers. 

Another argument advanced is that the more ma- 
ture pupils become the higher should be the general 
trend of the marks they receive. This contention is
based upon the undoubted fact that a selective proc-
ess is gradually taking place throughout elementary
and high school, and that it is usually decidedly marked 
in such places as the first elementary grade and the 
freshman year of high school. It is a debatable ques- 
tion, however, whether or not this tendency justifies 
higher marks from year to year, or, on the other hand, 
whether the standard of marking should not be raised 
and the distribution of marks given remain approxi- 
mately the same. In any case, the adoption of a fixed 
distribution as a standard does not necessitate that 
the same distribution be adopted for use in all grades 
and classes and therefore the argument just presented 
cannot justly be advanced against all use of a stand- 
ard distribution. 

A similar objection is sometimes raised regarding 
the marks assigned pupils in classes or groups which 
have been sectioned on the basis of ability. It may 
be answered in the same general manner. That is, if 
one believes that standards should be relative to the 
ability of a particular group of pupils rather than 
absolute, the same general distribution should be fol- 
lowed for groups of different abilities. If one believes 
just the opposite, several standard distributions may 
be adopted, there being as many as the different de- 
grees of ability supposed to be represented by the 
different groups. Thus if pupils are divided into supe-
rior, average, and inferior groups or sections, there
will be three standard distributions of marks, one for 
each type of group. 

2. Suggested practices concerning the use of a stand-
ard distribution of marks. In the first place the writer
wishes to outline his general belief as to the use of a 
standard distribution of marks. He agrees with those
who maintain that there should be such a distribution 
adopted in each school or school system, ordinarily the 
latter, which should serve as a general guide to 
teachers in assigning marks. In some cases it may be 
desirable to adopt several different standard distribu- 
tions for use with different groups of pupils or to 
allow different degrees of freedom in departing from 
the one adopted. In the case of all elementary and 
high-school classes not selected or grouped on the basis 
of ability the normal distribution is the one which 
should be followed. For selected classes, a distribution 
somewhat skew in the direction of selection, upwards 
for a superior group and downwards for an inferior 
one, should probably be substituted for the normal 
distribution. In any case, the distribution should not 
be followed rigidly and without thought. The degree 
of allowable or justifiable departure from a standard 
distribution is greater in the case of small than of 
large classes. The general trend of marks given by a 
teacher to several hundred pupils should approximate 
the ideal curve rather closely, unless most of her 
classes or sections are composed of pupils selected on 
the basis of ability. There may, however, be valid rea- 
sons why in a particular class there should be consid- 
erable departure therefrom. A desirable device to use 
in connection with the standard distribution or even 
in place of it is to employ another distribution ex- 
pressed not in terms of definite proportions or per 
cents of marks to be given, but as limits between which 
the proportions or per cents should fall. It should then 
be required that no individual teacher give any par- 
ticular class a distribution of marks falling outside 
of these limits without having conferred previously 
with her principal or other superior and presented sat-
isfactory reasons for so doing. The question of just
what the ideal distribution and the limits should be
will be taken up in the following paragraphs.

What is probably the most frequent abuse of the 
principle that teachers should be guided by a normal 
or some other standard distribution of marks has al- 
ready been hinted at. This is its too rigid application, 
especially in the case of small groups of pupils. One 
extreme example of misapplying the principle through 
observing the exact letter of the law rather than its 
spirit was related to the writer some years ago. This 
may sound as though it were made up for the sake 
of illustration, but the incident has been reliably 
vouched for as having actually occurred in a certain 
college. One of the members of the faculty thereof had 
let it be known that he followed the normal curve very 
closely in assigning marks and also announced the per 
cents of each letter which he gave. At the beginning
of one semester several of the students in a rather 
small class of his noticed that according to the per 
cent of failing marks which the instructor regularly 
gave just one individual in the class would fail. These 
students were concerned with getting credit out of the 
course with as little work as possible; therefore each
contributed a few dollars and with the amount thus 
raised hired another student at the institution to en- 
roll in the class and do such outstandingly poor work 
as to make him sure to be the one failed. The scheme 
resulted as hoped, and this one student alone was 
failed although almost half of the class did such poor 
work that they distinctly deserved failure. 
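
Why rigid application breaks down in a small class can be seen by turning per cents into numbers of pupils. The sketch below is hypothetical throughout: the incident related above states neither the size of the class nor the per cents the instructor used, so class sizes of 15 and 300 and the 7, 24, 38, 24, 7 distribution are assumed purely for illustration, and the name expected_counts is introduced here.

```python
def expected_counts(percents, class_size):
    """Number of pupils each mark would receive if the per cents were applied exactly."""
    return {mark: class_size * pct / 100 for mark, pct in percents.items()}


# Assumed five-symbol distribution, in per cents of all marks given.
standard = {"A": 7, "B": 24, "C": 38, "D": 24, "E": 7}

for class_size in (15, 300):
    counts = expected_counts(standard, class_size)
    print(class_size, {mark: round(n, 1) for mark, n in counts.items()})
```

In the assumed class of 15 the failing mark falls to about one pupil, so a single very poor student settles the matter for everyone else, whereas among 300 pupils the same per cent covers about twenty.
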

In another situation of somewhat the same nature 
the outcome was different and less striking. In this 
case also the class was small and the students were
reasonably sure that the instructor would give only 
one A. The three or four best students in the class hap- 
pened to be talking about the matter early in the se- 
mester and in the course of the conversation one of 
them made a proposal which was adopted and followed. 
His suggestion was to the effect that since no matter 
how hard they worked only one of them would receive 
an A, and since he thought they were of so nearly equal 
ability that if all worked equally hard it would be 
largely a matter of chance which one received it, they 
do work merely good enough to insure that they would 
be rated somewhat better than the other members of 
the class and not compete with one another as to who 
should receive the A. The result was that at the end 
of the semester no A’s were given in the class. The 
instructor stated at the last meeting that he would 
violate his usual practice because he believed that no 
one had earned an A, since he felt sure that the best 
students had loafed on the job and not put forth any- 
thing approaching their best efforts. 

Many suggestions have been made as to just what 
distribution of marks should be followed. The distribu- 
tions recommended have in most cases been normal, 
though in some they have been symmetrical but not 
normal, and in some not even symmetrical but skew. 
Furthermore, they have most often been planned on 
the assumption that a five-letter or symbol system of 
marking is to be used, though a few do not make this 
assumption. About twenty of these different suggested 
distributions of five symbols are given in the accom- 
panying table. The first part of the table contains a 
number of such symmetrical distributions and the sec- 
ond part three skew distributions. These have been 
gathered from various sources and probably include
almost all of the plans of this sort which have received
any considerable amount of attention as well as sev-
eral which have not. It will be noticed that there are
very considerable differences between the suggested
distributions. Some are based on the belief that as
many as 50 per cent of the marks given should be the
same, or, in other words, that as many as 50 per cent
of the pupils should be rated as average, whereas
others are based on the belief that less than one-third
of all pupils should be so rated. Similarly, some dis-
tributions would place from five to seven or eight times
as many as would others in the extreme groups. The
differences between the various suggested symmetrical
distributions are due chiefly to the fact that the normal
curve has been divided at different points in order to
produce five proportions or per cents of marks.²

TABLE III. SUGGESTED PERCENTILE DISTRIBUTIONS OF
MARKS FOR A FIVE-SYMBOL MARKING SYSTEM

PART I. SYMMETRICAL DISTRIBUTIONS

    A       B       C       D       E
    2      23      50      23       2
    3      23      48      23       3
    3      22      50      22       3
    3½     24      45      24       3½
    4      24      44      24       4
    4      21      50      21       4
    5      20      50      20       5
    5      24      42      24       5
    6      24      40      24       6
    7      21      44      21       7
    7      22      42      22       7
    7      24      38      24       7
    8      24      36      24       8
   10      15      50      15      10
   10      20      40      20      10
   10      24      32      24      10
   15      22      26      22      15

PART II. SKEW DISTRIBUTIONS*

    A       B       C       D       E
    2      18      50      24       6
    3      21      45      19      12
   14      44      33       6½      2½

* The skew distributions given above are those which have been advocated
by various persons. A number of others have been suggested as possible or as
illustrative of what might be done. (See 77, p. 309.)

Many of those who have proposed the distributions 
given above have not stated definitely where the fail- 
ing point should come, but presumably they agree with 
practically all those who have done so to the effect that 
in each case the four highest symbols should be pass- 
ing marks and the lowest one a failing mark. Thus the 
recommended per cents of failures differ greatly and 
indicate very diverse opinions as to standards of pass- 
ing work. In one or two cases it is explicitly stated that 
the lowest mark should include both failures and con- 
ditions. 

The suggestion has been made that a normal or 
other fixed distribution of marks be applied only to
those pupils who are given passing marks, the per 
cent of failures being determined independently of any 
such distribution. At one institution, for example, the 
following per cents of passing marks appear to have 
been taken as standard: A, 10 per cent; B, 30 per cent; 
C, 40 per cent; D, 20 per cent. A range of several points
about each is allowed. No satisfactory reason is ap- 
parent why a standard distribution, if used at all, 
should not apply to all marks rather than to passing 
ones only. 
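
What a distribution of passing marks alone means for the whole class depends on the per cent of failures, which must be fixed separately. The sketch below converts the passing shares quoted above into per cents of all pupils; the failure rate of 10 per cent and the name overall_percents are assumptions made here for illustration and are not figures from the institution mentioned.

```python
def overall_percents(passing_shares, failure_percent):
    """Convert shares of passing marks into per cents of all marks given."""
    passing_total = 100 - failure_percent
    overall = {mark: share * passing_total / 100
               for mark, share in passing_shares.items()}
    overall["Failure"] = failure_percent
    return overall


# Shares of the passing marks only, as quoted in the text.
passing_shares = {"A": 10, "B": 30, "C": 40, "D": 20}

# Hypothetical failure rate of 10 per cent of all pupils.
print(overall_percents(passing_shares, failure_percent=10))
```

Under the assumed failure rate the A's shrink from 10 per cent of passing marks to 9 per cent of all marks, and the other passing marks shrink in the same proportion.
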

² The statement above refers to a rather technical matter which need not be
explained in full here. Suffice it to say that the normal probability curve never
actually touches the base line above which it is constructed, and that it is there-
fore necessary to select some arbitrary point at which it will be cut off or, in
other words, considered as having reached the base line. The farther this point 
is away from the origin or center of the curve, the smaller will be the per cents 
of high and low marks and the greater that of average marks if, as is usually done, 
the portion of the curve used is divided into sections having equal bases. For 
a fuller discussion of this matter readers should see a text on educational 
statistics. 
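
The effect of the cut-off point can be worked out numerically. The sketch below divides the portion of the normal curve between the chosen cut-off points into five sections with equal bases and reports the per cent of cases in each. It rests on two assumptions made here: that the cut-off is measured in standard deviations from the center, and that the portion of the curve retained is treated as the whole group. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Proportion of a normal distribution lying below z standard deviations."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def five_mark_percents(cutoff):
    """Per cents in five equal-base sections between -cutoff and +cutoff."""
    width = 2.0 * cutoff / 5
    edges = [-cutoff + i * width for i in range(6)]
    # Area actually retained between the two cut-off points.
    retained = normal_cdf(cutoff) - normal_cdf(-cutoff)
    return [100 * (normal_cdf(high) - normal_cdf(low)) / retained
            for low, high in zip(edges, edges[1:])]


for cutoff in (2.5, 3.0):
    print(cutoff, [round(p, 1) for p in five_mark_percents(cutoff)])
```

Cutting the curve at 2.5 standard deviations gives roughly 6, 24, 39, 24, and 6 per cent, while cutting it at 3 gives roughly 3½, 24, 45, 24, and 3½, so moving the cut-off outward reduces the extreme groups and enlarges the average one. Both results lie close to distributions listed among the symmetrical ones in Table III.
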



It is not contended here that any one distribution of 
marks is decidedly or always better than any other. 
Those which are extreme in either direction, that is, 
very large in the per cents of high and low marks, or 
in those of average marks, should be avoided. In a 
five-symbol system 5 per cent in each of the extreme 
groups is probably too small and 50 per cent in the 
average group too large. On the whole the 7, 24, 38, 
24, 7 distribution is perhaps the best though the 10, 
20, 40, 20, 10 one is also to be recommended and four 
or five more of those in the same portion of the table 
do not differ from these enough to be much inferior. 

TABLE IV. SUGGESTED PERCENTILE DISTRIBUTIONS OF
MARKS FOR OTHER THAN FIVE-SYMBOL MARKING
SYSTEMS

PART I. THREE-SYMBOL SYSTEMS

    A       B       C
   16      68      16
   20      60      20
   25      50      25

PART II. FOUR-SYMBOL SYSTEMS

    A       B       C       D
    7      43      43       7
   10      40      40      10
   11      39      39      11
   15      35      35      15

PART III. SIX-SYMBOL SYSTEMS

    A       B       C       D       E       F
    1      12      37      37      12       1
    2      14      34      34      14       2
    4      16      30      30      16       4
    5      15      30      30      15       5

PART IV. SEVEN-SYMBOL SYSTEMS

    A       B       C       D       E       F       G
    1       6      24      38      24       6       1
    2       8      23      34      23       8       2
    3      10      22      30      22      10       3
    4      10      22      28      22      10       4
    5      10      20      30      20      10       5




In Table IV will be found a number of suggested distributions
of marks for systems employing three, four,
six and seven symbols. These also exhibit considerable 
variation though in the case of the first two at least it 
is not as great as in the case of the five-symbol sys- 
tems. In these cases also it does not appear that any 
one recommended distribution can be definitely stated 
to be better than the others. Although none of the dis- 
tributions given in this table is here recommended for 
use, the following are named as perhaps the best: for
three symbols the second one, with per cents of 20,
60, and 20; for four, also the second, with 10, 40, 40,
and 10 per cent; for six, the third, with 4, 16, 30, 30,
16, and 4; and for seven, also the third, with 3, 10, 22, 
30, 22, 10, and 3 per cent. 

As has been stated above a statement of limits 
within which a teacher’s actual distribution of marks 
should be expected to fall is a very desirable addition 
to or even substitute for a single standard distribu- 
tion. For a five-symbol system with four passing marks 
and one failing one, the following limits are recommended:
A’s, 5 to 15 per cent; B’s, 15 to 30 per cent;
C’s, 25 to 50 per cent; D’s, 15 to 30 per cent; and E’s,
5 to 15 per cent. For the systems having 3, 4, 6, and 
7 symbols, such limits will not be given in detail, but 
in general those for any particular mark should vary 
from about half of the recommended per cent of that 
mark to about one and one-half times the same per 
cent. 
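
The rule just stated, of running from about half to about one and one-half times the recommended per cent, is easily worked out for any distribution in Table IV. What follows is only a minimal sketch in Python; the function name `limits_from_recommended` is an assumption made for illustration, and the word "about" in the rule means the results are rough guides, not exact bounds.

```python
def limits_from_recommended(per_cents):
    """Return a (lower, upper) pair of limits for each mark.

    Applies the rule of thumb stated above: each limit runs from about
    half of the recommended per cent to about one and one-half times it.
    """
    return [(0.5 * p, 1.5 * p) for p in per_cents]


# Four-symbol distribution named above as perhaps the best: 10, 40, 40, 10.
recommended = [10, 40, 40, 10]
for symbol, (low, high) in zip("ABCD", limits_from_recommended(recommended)):
    print(f"{symbol}: {low:g} to {high:g} per cent")
```

In practice the figures so obtained would probably be rounded to convenient values, as was done with the five-symbol limits given above.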

Although most plans of marking which make use 
of only a few symbols have one failing mark and all 
the others passing, a few have been suggested in which 
there are two failing marks or one failing mark and 
one mark for conditions. For example, sometimes A,
B, and C are used as passing marks, D for conditions,
and E for failures, or D and E both as failing marks, 
D being given to pupils who come fairly close to pass- 
ing and E to those who do not. In most schemes employ- 
ing as many as seven symbols only five are definitely 
passing. In general it is true of the six and seven sym- 
bol systems that, if only the last mark is used for fail- 
ure, too few pupils are failed, and if the last two marks 
are so employed, too many are failed. 

It will be recalled that in the previous chapter the 
use of a five-letter system, two of the letters being 
failing marks, was recommended. Also the suggestion 
was made that A+ be given to each of a decidedly
limited number of the very best pupils and C- to each
of those just barely passing. It is here suggested that 
the standard distribution of marks for such a scheme 
should include five symbols but be based upon a four- 
division scheme with the two failing marks combined 
equivalent to the lowest division. Therefore, 10 per 
cent of A’s, 40 per cent of B’s, as many C’s, and 10 
of D’s and E’s combined may be taken as the ideal 
distribution. Probably not more than two out of each 
10 A’s should have the plus sign added, and about the 
same number of C’s be followed by the minus sign. 
The 10 per cent of failures may well be fairly evenly 
divided between D’s and E’s. As limits for such a 
plan, the following are suggested: A’s, 5 to 20 per 
cent; B’s, 25 to 50 per cent; C’s, 25 to 50 per cent; D’s 
and E’s combined, 5 to 20 per cent. For a class de- 
cidedly under average size these limits should be 
changed so as not to require that any A’s or any fail- 
ing grades be given, though frequently there should 
be some of one or both in even a very small class. 
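
A teacher or principal wishing to compare the marks actually given in a class with the limits just suggested might proceed somewhat as in the following sketch. It is illustrative only: the names `LIMITS` and `check_distribution` and the sample class are assumptions, not part of the plan itself, and plus and minus signs are simply ignored so that A+ counts as an A and C- as a C.

```python
from collections import Counter

# Limits suggested above, in per cent; D and E are counted together.
LIMITS = {"A": (5, 20), "B": (25, 50), "C": (25, 50), "D+E": (5, 20)}


def check_distribution(marks):
    """Return {group: (per cent, within_limits)} for a list of letter marks."""
    letters = [m[0] for m in marks]  # drop any plus or minus sign
    counts = Counter("D+E" if letter in ("D", "E") else letter for letter in letters)
    total = len(marks)
    report = {}
    for group, (low, high) in LIMITS.items():
        pct = 100 * counts[group] / total
        report[group] = (pct, low <= pct <= high)
    return report


# Hypothetical class of 40 pupils (illustrative data only).
marks = ["A+"] + ["A"] * 3 + ["B"] * 16 + ["C"] * 14 + ["C-"] * 2 + ["D"] * 2 + ["E"] * 2
for group, (pct, ok) in check_distribution(marks).items():
    print(f"{group}: {pct:.1f} per cent, {'within' if ok else 'outside'} the limits")
```

For a class decidedly under average size a report of "outside the limits" for A's or failing marks should not by itself be taken as a fault, in keeping with the exception noted above.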

A rather unique suggestion as to the application of
the normal curve to the distribution of marks has been
made by Blackhurst (4) in an article dealing with high-school
and college marking. The substance of his suggestions
is that instead of using the whole normal
curve, which he implies is desirable for elementary- 
school marks, the upper half thereof be used in assign- 
ing marks in high school and college. He states that we 
know, or at least assume that we know, that high- 
school and college students constitute a selected group 
and further that, although we do not know just what 
portion of the whole population composes this group, 
it is a fair assumption that it is the upper half. After 
discussing the question of whether or not this upper 
half redistributes itself to form a new normal distri- 
bution or remains similar to the upper half of a com- 
plete normal distribution, he concludes that the latter 
is nearer the truth. Therefore he recommends a distri- 
bution of marks similar to half of the normal curve. 
For example, if the standard distribution for elemen- 
tary marks is 10, 20, 40, 20, and 10 per cent, he would 
take one-half of this, that is, 10, 20, and 20 (1/2 of 40)
and multiply this by two to get a total of 100 per cent, 
thus giving per cents of 20, 40, and 40. To give another 
example, if the recommended per cents for elementary 
marks are 2, 7, 16, 50, 16, 7, and 2, those for high-school 
and college marks become 4, 14, 32, and 50. 
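
The arithmetic of Blackhurst's suggestion may be sketched as follows. The function name `upper_half_distribution` is an assumption introduced for illustration; the sketch supposes a symmetrical distribution listed from the highest mark to the lowest, with the middle group, when there is one, divided equally.

```python
def upper_half_distribution(per_cents):
    """Keep the upper half of a symmetrical distribution and rescale to 100.

    `per_cents` lists the per cents from the highest mark down to the
    lowest. With an odd number of groups the middle group is split in two.
    """
    n = len(per_cents)
    half = list(per_cents[: n // 2])
    if n % 2 == 1:
        half.append(per_cents[n // 2] / 2)  # half of the middle group
    return [2 * p for p in half]  # double so that the total is again 100


print(upper_half_distribution([10, 20, 40, 20, 10]))
print(upper_half_distribution([2, 7, 16, 50, 16, 7, 2]))
```

The two calls reproduce the two examples given in the text, namely 20, 40, and 40 per cent, and 4, 14, 32, and 50 per cent.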

The thoughtful reader will see that Blackhurst’s 
plan merely carries further the suggestion already 
made that for groups above or below normal or aver- 
age, distributions skewed in one direction or the other, 
as the case may be, may be used. It is doubtful if
Blackhurst’s assumption that a high-school or college 
group forms a distribution rather closely resembling 
the upper half of a normal curve is correct. From
all the evidence at hand it seems more likely that the
curve formed is best described as a skew curve, that 
is, a normal curve which has been modified by having 
some, but not nearly all, of the lower half taken away 
so that it is somewhat steeper on the lower side than 
on the other. 

3. Assigning marks to pupils in selected or non-aver- 
age groups. An important question often raised in 
connection with adopting and using an ideal or stand- 
ard distribution of marks, and sometimes also in other 
connections, is that of how much the marks given in 
groups selected on the basis of ability or for some 
other reason non-average should depart from the 
standard distribution for unselected pupils. Some per- 
sons argue that the difference in marks should be as 
great as the difference in ability between the particu- 
lar group concerned and an average group. Therefore, 
if pupils are classified into superior, average and inferior
groups and if a five-letter system is used, A,
B, and C being passing and D and E failing, they 
would give those in the superior groups A’s and B’s 
only, those in the average groups B’s and C’s only, 
and those in the inferior groups C’s, D’s and E’s.
Others maintain that in order to have the proper 
standards of work and to stimulate the pupils in each 
group to their best efforts marks should be assigned 
in each group according to the ability of that group
alone and therefore should range all the way from A 
to E in each group. 

Each of the two plans of marking just outlined has 
one chief advantage or supporting argument. That 
of the first is that a given mark usually is supposed 
to have the same meaning regardless of the group of 
pupils in which given, so that an A, for example, should
indicate about the same degree of achievement regardless
of classification of pupils or other conditions. On
the other hand, the second plan adapts the standards 
of marking to the ability of the group concerned, or, 
in other words, measures their achievement by com- 
paring it with their own ability and thus provides the 
best stimulation and motivation. There appears to be 
some merit in the contention of each side and the most 
satisfactory procedure is probably a compromise be- 
tween the two. It is therefore recommended that there 
be a somewhat severer standard for a more able than 
for a less able group, but not to the extent that one 
is justified in failing pupils in superior sections, for 
example, when they are doing a much better quality 
of work than pupils in inferior sections who are being 
passed. On the other hand, unless the superior pupils 
are held to somewhat higher standards they will prob- 
ably develop habits of mental slothfulness and lazi- 
ness which should if possible be avoided. 

In view of these reasons, the plan of distributing 
marks which is recommended for use when pupils are
grouped on the basis of ability is in accord with the
suggestion in the second paragraph above, that marks
in superior sections be limited to A’s and B’s, those
in average to B’s and C’s and those in inferior sec- 
tions to C’s and failing marks. It is not, however, ad- 
vised that the same letter have exactly the same sig- 
nificance in the different sections. There should be 
enough difference in the standards of work and mark- 
ing in groups on the three levels that a pupil barely 
earning a B, for example, in an average group would 
ordinarily be unable to earn more than a C in a supe- 
rior group. Likewise a pupil in an average group who 
is just barely falling below passing should generally
be able to earn a C in an inferior group, and so forth.
This plan would in no way prevent pupils placed in 
the wrong sections from receiving the marks they de- 
serve because there should be immediate transfer to 
other sections if their work is too good or too poor 
for the ones they are in. Thus if the work of pupils in 
superior groups becomes too poor for B’s they should 
be dropped into average sections. Similarly if pupils 
in average sections do work deserving A’s or probably 
even very strong B’s there they should be placed in 
superior sections, whereas if their work is failing and 
deserves D’s or E’s their proper place is in inferior 
sections. Likewise pupils in inferior groups doing 
strong C work or better should ordinarily be placed 
in average groups. Such a basis of assigning marks 
as has just been outlined results both in marks each 
one of which is reasonably constant in the amount of 
achievement it indicates, and also in standards some- 
what adapted to the particular groups of pupils to 
which they are applied. 

There is also another standpoint from which the 
matter may be considered and concerning which there 
are diametrically opposed viewpoints. If one assumes 
that the marks given pupils represent largely their 
ability to do further or more advanced work in the 
same subjects, it may be argued that the bright but 
indolent pupils who have done very poorly should be 
marked higher than the dull but industrious ones who 
have done approximately their best, because they, the 
bright pupils, undoubtedly have the possibilities of 
doing much better work in the future. On the other 
hand, if the marks given are thought of as being 
largely measures of pupils’ efforts, application and
attitudes, dull pupils who try hard should receive
higher marks. In this dilemma also the recommended
solution is a compromise. It is generally undesirable
to hold back a pupil if the teacher feels sure that he will
do the immediately succeeding work satisfactorily. It 
is, however, also undesirable to encourage a bright 
but lazy or uninterested pupil in his unsatisfactory 
habits and attitudes by passing him merely because of 
what he can do if he will. The teacher must therefore 
judge each individual case and decide which of these
two viewpoints fits it best and then act upon that one.
Likewise in the case of inferior pupils very near the
passing point who are putting forth considerable ef- 
fort the teacher must decide whether the danger from 
putting them ahead into work which they cannot han- 
dle satisfactorily or of discouraging them by holding 
them back despite their application and industry is the 
greater and then of course choose the other alterna- 
tive. What has just been said in the last two or three 
sentences may be criticized as too hazy and indefinite 
by some strong advocates of making marks as objec- 
tive as possible, but the writer, though a strong be- 
liever in objective measurements and marks, believes 
that only by doing as suggested can justice be done to 
individual pupils. Indeed he is willing to go so far as 
to say that in the case of a dull but industrious pupil 
who is completing all of a given subject that he intends 
to carry it is well to be somewhat more lenient in giv- 
ing marks than if the pupil were expecting to continue 
the same subject. For example, if such a pupil has 
completed two years of high-school Latin and has no 
intention of taking more it is better to pass him even 
though his work be slightly below passing. On the 
other hand, if he were expecting to continue his high-school
Latin it would not be desirable to do so, since
it would result in trouble for both him and his next
teacher. 

Some of those who believe that the marks given in 
groups constituted on the basis of ability should form 
different distributions have suggested plans of deter- 
mining the standard distribution for each given group 
of pupils. The one of these which has perhaps been 
most often mentioned involves the determination of 
the intelligence of the group concerned. It is proposed 
that the pupils’ scores upon an intelligence test, per- 
haps supplemented by other data, be compared with 
those of an average group of pupils of the same grade 
status. The standard distribution of marks for a par- 
ticular group is then as much above or below that for 
all pupils as the distribution of intelligence test scores 
for the particular group is above the distribution for 
all. In general this proposal does not change the char- 
acter or shape of the distribution of marks given but 
merely shifts all up or down. The plan assumes that 
marks are given on an absolute basis in the sense that 
each denotes the same amount of achievement with- 
out regard to the ability or effort expended. Such a 
procedure as this is not entirely out of harmony with 
that just recommended, but appears to go too far in 
making marks measures of absolute achievement and 
nothing else. If intelligence test scores or other ratings 
which might serve the same purpose are available they 
should be taken into consideration in the assignment 
of marks, but not serve to raise or lower the total dis- 
tribution thereof as much as they are above or below 
those of an average group of pupils. 

This general question of assigning marks to groups
of different abilities arises not merely when there are
several differentiated or selected groups in the same
grade or class, but also as between a lower grade or
class in which there has been a relatively small amount
of selection, and a higher one in which more selection 
has taken place. For example, it is commonly the prac- 
tice to fail many more high-school freshmen and to 
give fewer high marks to members of that class than 
is the case with seniors. The question arises whether 
or not this is justifiable. The best answer to this ques- 
tion appears to be of the same sort as to the closely 
related one treated in the last few paragraphs, that a 
compromise is best. It is probably justifiable to fail 
more first-grade than eighth-grade pupils, for example;
also more high-school freshmen than seniors. The
chief reason is that it is a function, though undoubt- 
edly not the chief one, of the school to select those who 
can profit by further work. Especially in those grades 
where children begin a type of work new to them, the 
pupils who are unable to do the work satisfactorily 
should be discovered and either prevented from ad- 
vancing, better prepared for doing so, or transferred 
to kinds of work better adapted to their abilities and 
interests. On the other hand, eighth-grade pupils as 
compared with those in the first grade and high-school 
seniors as compared with freshmen should be held up 
to somewhat higher standards of work and therefore 
should be marked somewhat more rigorously. 

4. Adjusting the marks of teachers who do not con- 
form to the standard. It has sometimes been urged that 
the best way to secure satisfactory distributions of 
marks is not to adopt and require teachers to follow a 
standard distribution, but rather to shift the marks 
assigned by some systematic plan so that they will ap- 
proximate the desired standard. In other words, 
teachers should report the marks of their pupils to the
principal, superintendent or other official and under his
direction they should be transmuted so that they come 
as near as possible to the desired distribution before 
being given out to pupils or entered upon the school 
records. Various plans and methods of accomplishing 
this by adjusting the marks of teachers who mark 
too high or too low have been proposed. A number of 
these are impracticable for regular use because the 
amount of labor required is too great, but several of 
those which do not require excessive work will be men- 
tioned. 

Probably the simplest plan suggested is that the 
marks of each teacher be tabulated to form a distribu- 
tion, and the whole distribution merely moved up or 
down the proper number of points, that is, all of the 
marks raised or lowered by an absolute amount. An- 
other somewhat similar proposal is that the marks be 
multiplied by a constant amount which would in the 
case of each teacher be determined by the ratio be- 
tween her average mark and that of all the teachers 
or perhaps the mark which has been selected as the 
ideal or standard average. The first of these is fairly 
simple, but is open to the serious objection that if a 
large enough increment to raise the lower marks suf- 
ficiently is added it may result in marks of above 100 
if a percentile system is employed or above the high- 
est symbol in any other. The second is slightly more 
difficult to use and likewise may result in marks above 
the possible upper limit. It also appears to increase 
the higher marks too much in comparison with the 
lower ones. A third plan is to raise the original marks 
by a certain fraction of the difference between each 
mark and the perfect mark, whatever that may be. 
Thus, for example, if it had been decided to raise all
of the marks given a particular group of pupils by
20 per cent of the difference between each and the 
highest possible mark a mark of 60 on the percentile 
scale would be raised to 68 since 8 is 20 per cent of the 
difference between 60 and 100. This scheme is a little 
more difficult to use than either of the two previously 
mentioned and probably increases lower marks too 
much in proportion to the change in the higher ones. 
In the case just mentioned a mark of 90 would be 
raised only two points and one of 95 only one point 
as compared with an increase of eight for one of 60. 
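
The three plans just described may be compared side by side in a short sketch. The function names and the sample averages of 75 and 80 are assumptions chosen for illustration only; the marks 60, 90, and 95 and the fraction of 20 per cent are those used in the example above.

```python
def shift_marks(marks, points):
    """First plan: raise (or lower) every mark by the same number of points."""
    return [m + points for m in marks]


def scale_marks(marks, teacher_average, standard_average):
    """Second plan: multiply every mark by the ratio of the standard
    average to the teacher's own average."""
    ratio = standard_average / teacher_average
    return [m * ratio for m in marks]


def close_gap(marks, fraction, perfect=100):
    """Third plan: raise every mark by a fixed fraction of the difference
    between it and the perfect mark."""
    return [m + fraction * (perfect - m) for m in marks]


marks = [60, 90, 95]
print("first plan: ", shift_marks(marks, 8))        # 95 + 8 = 103, above the limit of 100
print("second plan:", scale_marks(marks, 75, 80))   # hypothetical averages of 75 and 80
print("third plan: ", close_gap(marks, 0.20))       # 20 per cent of the gap to 100
```

The first two plans can carry the highest marks past 100, whereas the third cannot do so but gives the mark of 60 an increase of eight points and that of 95 an increase of only one.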

A plan which avoids the objections to the three al- 
ready mentioned and, though involving some labor at 
first, may be used very readily later has been described 
by Weld (92). He recommends determining the type 
of distribution of marks given by each teacher, clas- 
sifying it as one of twenty standard types, and then 
transmuting her marks to standard marks according to 
a prepared table. The determination of the type to 
which a teacher belongs is based upon the proportion 
of her marks below passing and above 90. For ex- 
ample, a teacher who gives 20 per cent of her marks 
above 90 and 15 per cent below 70 belongs in Type 1. 
Her mark of 50 should be changed to 55, of 55 to 65, 
of 60 to 70 . . . and, finally, her marks from 95 to 100 
should all be made 100. A teacher who assigns 50 per 
cent of her marks above 90 and 8 per cent below 70 is 
in Type 8. Her mark of 50 becomes 54, 55 becomes 59, 
and so on. Similar bases of changing each of the twenty 
types to the standard distribution are given. All in
all, Weld’s plan is probably better than any of the
three previously described. 

On the whole it seems unwise to adopt any mechanical
or automatic plan of this sort, however. In the
first place, those already mentioned and practically all
other suggested plans of shifting marks because teach- 
ers mark too high or too low are only applicable to a 
percentile or perhaps some other numerical system. A 
second decided objection to them is that if teachers 
are aware of how their marks are to be shifted they 
can invalidate the procedure by giving such marks
that after the change the resulting marks will be what 
they would have given if no change were to follow. 
A third and important reason which renders doubtful 
the wisdom of any plan of increasing or decreasing all 
the marks given by an individual teacher according to 
any uniform numerical scheme is that the available 
evidence does not warrant the assumption that all 
teachers or even most of them are consistently high or 
low markers. Undoubtedly many are fairly consistent, 
as tabulations including their marks given over a con- 
siderable period of time show, but, on the other hand, 
many teachers are decidedly variable. Hulten (37), for 
example, presents evidence based on the scoring of 
compositions by twenty-eight English teachers which 
shows that few of these teachers were at all consistent 
in their markings. This statement holds true both when 
they were compared with one another, and when the 
two series of marks by each teacher were compared. 
In other words, much of the variability in teachers’ 
marks is not due to disagreement among the consist- 
ent tendencies of different teachers, but rather to 
temporary and more or less chance conditions which 
affect teachers’ mental attitudes and opinions of the 
work being done and thus cause the marks of a single 
teacher to show much variability. A teacher may be 
feeling unusually well and happy at one time and just 
the opposite at another. She may have a feeling that
she has been giving too high marks and should therefore
be more severe, or vice versa. She may happen to
have in mind the work of several of her best pupils
and use this as a basis of comparison at one time,
whereas at another she has in mind the work of some 
of her worst pupils. She may be irritated at certain 
pupils because of their recent behavior or for some 
other reason, and kindly disposed toward others who 
have attracted her favorable attention somehow or 
other. These and many other causes result in a tend- 
ency to mark too high on certain occasions and too low 
on others, or even at practically the same time to 
mark some pupils too high and others too low. 

A rather striking illustration of the effect of such 
causes came to the writer’s attention some years ago. 
Some samples of pupils’ handwriting were being rated 
by five teachers according to the Ayres Handwriting 
Scale. This scale consists of specimens rated at the 
even tens from 20 to 90 inclusive and is often thought 
of as being essentially in percentile terms. It so hap- 
pened that a certain teacher rated the papers of about 
half of the second-grade pupils after school on Friday
afternoon, and those of the other half on the succeeding
Monday afternoon. When the ratings of the five
teachers were brought together to be averaged it was 
noticed that although according to the judgment of 
the other four teachers there was practically no dif- 
ference in the average ratings of the papers of the sec- 
ond-grade pupils which this teacher had rated on Fri- 
day afternoon, and of those which she had rated on 
Monday afternoon, yet the average of her Monday 
afternoon ratings was practically 20 points higher than 
that of her Friday afternoon ones. She was asked concerning
the matter and after reflecting for a few moments
stated what she felt sure was the true explanation
of the difference, though she had not realized it
before. This was that on Friday afternoon she was not 
at all well, having even thought of leaving the school 
building without completing her day’s work because 
she felt so ill, and also was somewhat worried about 
the matter because she had been steadily getting worse 
for several days. Nevertheless, she started to rate the
second-grade papers intending to do all of them, but 
felt unable to complete the task and so quit when she 
did. Over Saturday and Sunday her illness completely 
disappeared and on Monday, as she expressed it, she 
felt in as good health and spirits as she ever did in her 
life. This difference in her physical and mental con- 
dition resulted in an average difference in her ratings 
of about 20 points on the scale or the equivalent of the 
amount of improvement in handwriting expected of pu- 
pils during several years’ work in the elementary 
grades. 

Because of these and other reasons it is not recom- 
mended here as it is in some discussions that the best 
way to reduce the variability of marks is for some 
supervisory or administrative official to change the 
marks reported by teachers so that they conform to the 
desired distribution. It seems decidedly preferable to 
discuss the question of marks in teachers’ meetings 
and in cooperation to work out the general principles 
which are to be followed. There will probably be con- 
siderable disagreement among the teachers in any 
school or system over the question and probably the 
supervisory official in charge will need to impose his 
opinion to some extent at least. This of course should 
be done by persuasion and by trying to lead the 
teachers to see the appropriateness and wisdom of such
principles and regulations as are laid down. The supervisor
is decidedly lacking in ability and tact or the
teachers are unusually conservative and opposed to
change if the former cannot in the course of several
teachers’ meetings lead most of them to partial or even 
complete acceptance of a set of desirable principles. 

In this connection it will probably be helpful to 
sketch the program outlined by Starch, who has not
only been prominent in showing the variability and 
unreliability of marks given by teachers but has like- 
wise given attention to the matter of reducing varia- 
bility. In one of his articles (79) he has suggested how 
this may be done and given figures which show the 
actual results of efforts to do so in a particular situation.
His program includes three steps: first, the study
and discussion in teachers’ meetings of the marks
given by various teachers; second, the determination
within departments of some common plan of scoring
types of work and kinds of errors frequently met with
in the subject in question; and, third, following more
or less closely a fixed curve of distribution. He gives
a table showing the original ratings of a set of pupil 
compositions and likewise their later ratings by the 
same teachers, twelve in number, after a program such 
as he outlines has been carried out. In terms of points 
on the percentile scale the mean variation of the marks 
given the different compositions when first rated 
ranged from 1.7 to 6.3, the average being 4.2, whereas 
the mean variations of the marks on the second rating 
ranged from .9 up to 4.6 with an average of only 2.8.
This represents a reduction in mean variation of about 
one-third. The difference between the highest and low- 
est marks given the same papers averaged 19 points 
on the first rating and only 11.2 on the second, a decrease
of somewhat more than one-third. It appears,
though it is not absolutely clear from his article, that 
the reduction in variability was due almost entirely 
to suggesting to the teachers that they use a fixed 
curve in assigning marks. Apparently, therefore, if 
more attention had been devoted to his first two items 
in the program, that is, to the discussion of the matter 
in teachers’ meetings and to the determination of com- 
mon principles of marking by teachers of the same 
subject, a still greater reduction in variability might 
have followed. 
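
The two measures of disagreement used in the figures quoted from Starch, and the size of the reductions reported, may be illustrated by the following sketch. It assumes that "mean variation" denotes the average of the absolute deviations of the marks from their mean, which is the usual sense of the term but is not stated in the passage above; the function names and the five sample marks are likewise hypothetical.

```python
from statistics import mean


def mean_variation(marks):
    """Average absolute deviation of the marks from their mean
    (assumed meaning of "mean variation")."""
    centre = mean(marks)
    return mean(abs(m - centre) for m in marks)


def relative_reduction(before, after):
    """Fractional decrease in going from `before` to `after`."""
    return (before - after) / before


# Figures reported in the text for the first and second ratings.
print(round(relative_reduction(4.2, 2.8), 2))   # average mean variation
print(round(relative_reduction(19, 11.2), 2))   # average range of marks

# Hypothetical marks given to one composition by five teachers.
marks = [70, 75, 80, 85, 90]
print(mean_variation(marks), max(marks) - min(marks))
```

The first figure printed is very nearly one-third and the second somewhat more than one-third, in agreement with the statements made above.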

Although the manner and means of introducing and 
discussing the question of marking in teachers’ meet- 
ings are in a general way similar to those for any 
other topic, it will perhaps be worth while to offer some 
definite suggestions as to just how to approach the 
matter. Very frequently an effective introduction is 
accomplished by requesting the teachers concerned to 
mark a certain paper submitted to them independently 
of one another and then to tabulate the results so as to 
show the amount of disagreement. Sometimes it is 
profitable to repeat this procedure two or three times 
with different papers. In addition to this firsthand ex- 
perience the teachers should be made familiar with 
such studies as those of Starch and Elliott which show 
the difference between the marks of different teachers 
or of the same teachers at different times. After 
enough evidence of this sort has been accumulated to 
make the point it is probably well to call for general 
discussion and suggestions as to how to improve the 
situation. Evidence should be prepared showing the 
prevalence of the normal distribution, especially in 
mental measurements of pupils, and some attention 
given to the question of how nearly the pupils with
whom the teachers deal form normal or random groups.
It will probably be necessary to discuss the signifi- 
cance of marks and the basis upon which they should 
be given, that is, the different factors or elements
which would contribute to their determination. Simi- 
larly the question of just what distribution is best 
should be considered in different connections. It is 
often helpful to have at hand tabulations of all the 
marks given in the school or system in question dur- 
ing the past semester or year or perhaps for a longer 
period and also of all those given in other schools or 
systems which are fairly comparable with the local 
one. It will usually be found that the total distribu- 
tion of marks given by a considerable number of 
teachers will approximate normality and therefore in- 
dicate that on the average teachers believe school 
achievement or whatever is measured by school marks 
has a fairly normal distribution. After such a series 
of meetings and discussions the teachers should be 
ready to proceed to draw up and adopt general prin- 
ciples and specifications for their future guidance 
along the lines already suggested. 

If, as will often be the case, certain individual 
teachers refuse to conform within reasonable limits, 
individual conferences with them should be held and 
if extreme measures are necessary their grades may 
be shifted rather according to the judgment of the 
supervisor in charge, based upon the ideal distribution 
and knowledge of conditions, than according to any 
mechanical plan of transmutation. For example, if a 
teacher has given no A’s and too many failing grades, 
and is both unable to justify so doing and unwilling 
to shift them, her principal should probably ask her
to indicate which of the B’s represent the best work
and change these to A’s, and so on with the other letters.
On the other hand, if a teacher has given too
many A’s she may be asked to indicate which of them
represent the poorest work, these may be changed to 
B’s, and so on down. 

5. Summary. There has been considerable argument 
on both sides of the question of whether the normal 
or some other standard distribution of grades should 
be adopted and followed. The chief argument for so 
doing is that it provides the best practical means of 
insuring uniformity in the meaning of marks given by 
different teachers. Most of the arguments against such 
a plan will be found upon critical examination to be 
against its abuse or over-mechanical use rather than 
against the plan itself. Therefore most of the objec- 
tions raised can be avoided by proper formulation 
and administration of the proposed scheme. The dis- 
tribution adopted should not be applied with unvary- 
ing exactness, but rather should be regarded as an ap- 
proximate or general guide. It is best to state the 
proportion of each mark to be given in terms of limits 
rather than as a single fixed percentage. A large num- 
ber of suggestions as to how the marks in systems 
involving three to seven symbols should be distributed 
have been made. The plan recommended is that there 
be three passing and two failing marks with the following
limits: A’s, 5 to 20 per cent; B’s, 25 to 50 per
cent; C’s, 25 to 50 per cent; D’s and E’s combined,
5 to 20 per cent. A very few A+’s and C-’s may be
given. In assigning marks to pupils sectioned on the 
basis of ability, one should compromise between the 
principle that any given mark should be considered 
with regard to the amount of achievement for which it 
is given, and the second principle that marks should
be given in accordance with the ability of the pupils
in the group being marked. Therefore no low marks 
should be given in a superior group nor high marks in 
an inferior, but pupils located therein who appear to 
deserve them should be shifted in the proper direction.
A similar policy of considering both factors in the 
situation should be followed in the case of bright but 
indolent pupils and also of dull but industrious ones. 
Still a third situation in which the same sort of com- 
promise may be applied is the assigning of marks in 
lower as compared with upper grades or classes. A 
number of more or less automatic plans of adjusting 
marks of teachers who do not conform to the standard 
have been suggested. It is recommended, however, that 
instead of allowing teachers to mark as they desire 
and then changing their marks, a program of educa- 
tion and discussion be carried out, especially at 
teachers’ meetings, and the teachers led to adopt and 
employ a satisfactory system. If a few teachers refuse 
to follow it satisfactorily, their marks may be transmuted
as appears necessary.
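
The recommended limits lend themselves to a simple check of any one teacher's marks. What follows is only a minimal sketch in Python, not a procedure given in the text: the function name and the sample class of thirty pupils are hypothetical, and the limits are those stated in the summary above. In keeping with the recommendation that the distribution serve as an approximate guide, the result is meant to prompt a conference rather than an automatic change of marks.

```python
# Recommended limits, in per cent, taken from the summary above.
LIMITS = {"A": (5, 20), "B": (25, 50), "C": (25, 50), "D+E": (5, 20)}


def marks_outside_limits(counts):
    """Return {mark group: per cent} for every group outside its limits.

    `counts` maps each letter (A to E) to the number of pupils given it.
    The function name and interface are illustrative assumptions.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no marks were given")
    # D's and E's are judged together, as in the recommended plan.
    grouped = {
        "A": counts.get("A", 0),
        "B": counts.get("B", 0),
        "C": counts.get("C", 0),
        "D+E": counts.get("D", 0) + counts.get("E", 0),
    }
    outside = {}
    for mark, (low, high) in LIMITS.items():
        per_cent = 100 * grouped[mark] / total
        if not low <= per_cent <= high:
            outside[mark] = round(per_cent, 1)
    return outside


# Hypothetical class of 30 pupils: no A's and too many failing marks.
print(marks_outside_limits({"A": 0, "B": 10, "C": 10, "D": 6, "E": 4}))
```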



CHAPTER VII 


MERITS AND LIMITATIONS OF TRADITIONAL
AND NEW-TYPE EXAMINATIONS 

1. Merits and advantages of traditional, and limitations
and disadvantages of new-type, examinations.
Within the last few years a considerable amount of 
space and time has been devoted to condemning tra- 
ditional examinations and to showing or attempting 
to show that those of the new type are superior. Many
valid points have been made but frequently, perhaps 
usually, the merits and advantages of traditional ex- 
aminations have been overlooked. All too often it has 
either been explicitly stated or else implied by the 
trend of the discussion that the abolition of tradi- 
tional examinations is desirable. It is very unfortu- 
nate that such an attitude should have been taken and 
expressed. It was only natural that the protagonists 
of new-type tests should in their enthusiasm over- 
estimate and overstate their value, but in many in- 
stances this has been done in such extreme fashion that 
little excuse therefor is apparent. A thoughtful consideration
of the question will undoubtedly lead to the
conclusion that each of the two types has its peculiar 
points of strength and its distinct advantages in actual 
use. Therefore, as has already been stated, the writer 
believes most emphatically that a complete testing pro- 
gram of any teacher should include some use of both 
kinds. In other words, it is not a question of deciding
whether the essay or the new-type examination is the
better and then of making exclusive use thereof, but 
rather of determining the occasions and circumstances 
under which each is most valuable and then employ- 
ing each in accordance therewith. In the hope of ac- 
complishing this purpose the comparative merits and 
limits of each will be discussed in this chapter.

One of the advantages of the essay examination is 
that it is easier to make and to give than is the other 
type. Since it is composed of fewer questions or ex- 
ercises the teacher can prepare it in less time. It is 
probably true also that the degree of mental effort re- 
quired in its construction is less, although this may 
not hold if the person making it is equally familiar 
with the new-type test, and if the traditional examina- 
tion constructed is of as high a degree of merit as the 
other. Moreover in most cases it is at least reasonably 
satisfactory to write a few general discussion ques- 
tions upon the board or even sometimes to give them 
orally whereas new-type tests generally require that 
a copy be placed in the hands of each pupil in order 
to be effective. This requirement is frequently a very 
practical hindrance to their use since it is sometimes 
absolutely impossible and frequently decidedly difficult
for a teacher to provide the necessary number
of copies. Many schools, especially small ones, do not 
have any sort of device such as a mimeograph or 
hectograph by which a number of copies can be made. 
Even in the case of schools which do have such de- 
vices it is not always easy to secure the desired num- 
ber of copies. For a very small class it may be prac- 
ticable to use carbon copies made upon the typewriter, 
but for classes of ordinary size this is hardly practi- 
cable, requiring too great an amount of labor. 




There is no doubt that up to the present time 
teachers on the whole are considerably more familiar
with traditional than with new-type examinations. Although
much has been said and written concerning the
latter, they have been scarcely heard of by many 
teachers and are not understood as to purposes, limita- 
tions, and administration by many others. It is true 
that it does not require a great deal of study on the 
part of a teacher to acquire a fair knowledge and understanding
of them, but many teachers have been unwilling
to put forth the requisite amount of effort
even when their attention has been directed along this 
line. Until the many teachers of this sort are better 
trained and informed about new-type tests it is prob- 
ably unwise to attempt to compel or even induce them 
to make a large use of such tests. 

It seems probable that those who favor the tradi- 
tional examination are correct in their assertion that 
it tests reasoning and most other thought processes 
except memory better than do new-type tests. The lat- 
ter tend to measure only knowledge or facts acquired, 
and that often in rather disconnected fashion. Such 
qualities and mental activities as originality, initiative, 
organization, interpretation, analysis, discrimination, 
judgment, subtlety, and so forth, are, it is said, only 
slightly if at all measured by new-type examinations.
The essay examination, however, often allows a pupil 
a considerable degree of freedom in choosing the form 
of his answer, and not only allows but even requires 
him to select from a fairly large stock of information 
the portion which he wishes to use. Moreover, the
situation is frequently such that many items which 
might be selected are neither absolutely right nor 
wrong. Thus judgment and other of these qualities
are called into play much more than on most new-type
tests. It is further claimed that because of these
facts new-type tests give an advantage to average or 
mediocre pupils. Such pupils are usually content 
merely to memorize the text and thus frequently make 
better scores upon tests calling for memorized facts 
than do superior pupils for whom such tests provide 
no outlet or means of expression for their originality 
and initiative. For the same reason essay examinations 
reveal certain facts concerning individual differences 
in the quality of mental activity which are not shown 
by those of the other variety. 

The claims just made for traditional examinations 
are, however, not fully valid. Frequently the time 
limits upon such examinations are so short that they 
really become chiefly tests of memory rather than of 
reasoning, organization, and other abilities, even 
though they might provoke activities of these sorts if 
sufficient time were given. Moreover, the actual exer- 
cises or questions which they contain frequently deal 
with factual material to just as great an extent as do 
those in new-type tests. Still further, several varieties 
of the latter do stimulate and measure pupils’ critical 
ability, their discrimination, judgment, and so forth. 
Such types as the true-false, which requires them to 
decide whether statements are true or not, the mul- 
tiple-answer, in which one or sometimes more correct 
answers must be selected from a group, and others 
can be made to serve these purposes. Even though 
new-type tests deal largely with separate points or 
facts, the material covered can if desired consist of 
general principles, rules, laws, and so forth, as well 
as mere bits of information. Therefore, although it 
is probable that traditional examinations do test a
greater variety of mental processes than do objective
tests, it is not inevitable that they do so and the
latter also can be made to measure, at least to some 
degree, most of these processes. 

In most varieties of the new examination pupils are 
in some form or other given a number of possible an- 
swers from which to select the correct ones. In other 
words, they are not thrown upon their own resources 
to the same extent as by discussion questions. Knowl- 
edge which is only marginal or hazy is frequently suffi- 
ciently quickened by the suggested answers that the 
correct responses are recognized whereas there is usu- 
ally no result of this sort in connection with the essay 
examination. The argument may be made, however, 
that it is not altogether undesirable that this be the 
case, that is, that some tests secure more or less suggested
responses. On the whole it appears that new-type
tests are inferior to discussion examinations on
this point, but not so far inferior as has sometimes 
been asserted. 

The essay examination, if properly administered, not 
only measures ability to organize and express ideas but 
also gives training in such ability. Even though it is 
not admitted that it is an important function of an ex- 
amination to give such training little if any objection 
can be raised to the incidental benefits of this sort 
which it can be made to yield. 

It is also claimed that the discussion type of exam- 
ination can be more easily and directly adapted to 
various kinds of subject-matter. This advantage should 
not be overemphasized since the large number of varieties
of objective tests renders possible their adaptation
to many kinds of subject-matter, but still appears,
on the whole, to be present.




A strong objection commonly made to the new examination
is that it encourages guessing. This is especially
charged against alternative tests in which pupils
know that they have one chance out of two of
guessing right in each particular case, but also to 
some extent against multiple-answer tests, matching 
tests, and several other varieties. It is undoubtedly 
true that it is easier for pupils to give brief responses 
than to write rather long discussions if they know lit- 
tle or nothing about the matter at issue in either case. 
However, if the tests are properly administered, in- 
cluding satisfactory directions for the pupils, and also 
properly scored, it is not apparent that the amount of 
guessing which occurs is great enough to be a very 
serious fault. One item in satisfactory directions 
should be a statement strongly advising or directing 
pupils not to guess, that is, not to record an answer 
unless they are at least fairly sure it is correct. How- 
ever, if one wishes to obviate the possibility of pupils 
profiting by guessing in spite of the ordinary scoring 
methods to be described later, methods which on the 
whole prevent it from increasing their scores, he may 
well do something of the sort suggested by Christensen 
(14). This is that after one type of test, such, for ex- 
ample, as a true-false one, has been given, it be fol- 
lowed fairly soon, that is within a day or two, by one 
of another type, perhaps multiple-answer, covering 
the same material and even corresponding item for 
item with the first test. The two tests should then be 
scored together and credit given only for those items 
correctly answered in both. Even apart from its tend- 
ency to reduce guessing, such a repetition is occasion- 
ally desirable. 
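
The combined scoring just described can be set out briefly. The sketch below is an illustration only and is not taken from Christensen: the function names are hypothetical, and the chance calculation assumes blind guessing and a multiple-answer item with exactly one correct answer among its suggested answers.

```python
from fractions import Fraction


def credited_items(first_test, second_test):
    """Count the items answered correctly on BOTH tests.

    Each argument is a list of True/False values, one per item, where
    True means the pupil's response to that item was correct. The two
    tests are assumed to correspond item for item.
    """
    if len(first_test) != len(second_test):
        raise ValueError("the two tests must correspond item for item")
    return sum(1 for a, b in zip(first_test, second_test) if a and b)


def chance_of_double_guess(n_choices):
    """Chance of guessing an item right on a true-false test and again on
    a multiple-answer test offering `n_choices` suggested answers
    (assumption: blind guessing, one correct answer)."""
    return Fraction(1, 2) * Fraction(1, n_choices)


# Hypothetical pupil: right on items 1, 2, 4 of the true-false test and
# on items 1 and 4 of the multiple-answer test.
print(credited_items([True, True, False, True], [True, False, False, True]))
print(chance_of_double_guess(4))
```

Under these assumptions a pupil who guesses blindly is credited on an item far less often than on the true-false test alone, which is the reason the plan reduces the profit from guessing.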

The claim has been made that new-type examinations
encourage greater dishonesty on the part of pupils
because it is much easier to cheat upon them than
upon those of the ordinary discussion type. There is 
some justification for this claim because the answers 
are so short that it is relatively easy to see the answer 
of a neighbor by a hasty look. Not infrequently, how- 
ever, in the case of an essay examination sufficient in- 
formation can be gained by a mere glance to enable 
a pupil to profit thereby. Nevertheless, on the whole 
it cannot be denied that there is some validity to this 
argument. If a teacher feels that the danger of cheat- 
ing is too great and that it cannot be avoided or con- 
trolled in any other way, it is possible to make use of 
the procedure suggested in the discussion of cheating 
given at the end of Chapter III or of some other similar
plan.

An objection frequently urged against new-type 
tests is that they, or at least several of the most com- 
mon varieties of them, tend to confuse the pupil as 
to what he really knows or even to teach him erro- 
neous facts. This charge is especially made against 
true-false exercises since practically half of the state- 
ments contained therein are false. It is also made 
against the multiple-answer type, in which several of 
the suggested answers are incorrect; against the
matching type, in which the pupil may form wrong
combinations which will tend to remain in his memory;
against the incorrect statement type for the same rea- 
son as in the case of the true-false type, and so forth. 
In reply to this criticism there are at least three 
points to be made. In the first place, material which the 
pupils have not already studied and supposedly learned 
in the correct form should not be covered or presented 
in tests. If something has already been well learned
this knowledge will not be disturbed or confused by
seeing a false statement concerning the matter. In the 
second place, it may be true that some confusion re- 
garding facts only partially mastered may be caused, 
but this should be satisfactorily taken care of and 
corrected by the teacher in her discussion of the errors 
made upon the test. Finally it should be recognized 
that in life outside the school individuals are very 
frequently called upon to distinguish true statements 
from false ones, valid arguments from invalid ones, 
to select the best of several possibilities, or to do 
something else resembling very closely some variety 
or other of new-type tests. It is therefore highly de- 
sirable that some training along these lines be given 
pupils in school, and it is eminently worth while to 
risk the confusion of ideas and knowledge which may 
occur to a limited degree in the endeavor to avoid 
much more serious confusion later and to develop 
critical ability in the ordinary affairs of life. 

The statement has been made that objective or near- 
objective tests are too artificial in that they do not re- 
semble life’s situations or problems in one important 
particular. This is that the problems met with in life 
outside the school are such that there is rarely one 
and only one correct solution and all others wrong, 
but that instead there are frequently several solutions 
of approximately equal merit or perhaps several of 
which one is slightly better than another, the second 
slightly better than a third, and so on. It is therefore 
argued that pupils should not become accustomed to 
looking for answers or solutions which are absolutely 
right or wrong, but should be trained as much as pos- 
sible in dealing with situations in which all or prac- 
tically all of the elements or factors are subjective. 




Although there is some truth in the contention just 
stated, it is not apparent that traditional examina- 
tions as ordinarily administered are of much if any 
more value in giving training of the kind desired than 
are new-type tests. Probably if traditional examina- 
tions were administered with this end in view they 
could be made to yield considerably greater returns 
along this line than is true at present and also greater 
ones than would come from objective tests. On the 
other hand, some varieties of the latter, such as the 
multiple-answer type with several answers of varying 
degrees of merit, do give training of the type speci- 
fied above. 

2. Merits and advantages of new-type, and limitations
and disadvantages of traditional, examinations.
The merit of new-type examinations which is probably
most often stated first is that they are more reliable
than those of the traditional variety. A considerable
mass of data concerning which more will be
said a little later has been accumulated in support of 
this statement. There appear to be two chief causes 
for the difference in reliability. One of these is that 
the typical essay examination contains comparatively 
few separate questions or exercises and that these are 
too few in number to constitute a satisfactory or re- 
liable sampling of pupils’ knowledge or achievement. 
Too few topics are covered and there is too much
chance that these few will be among those which some 
pupils just happen to know and others not to know. 
This defect could be remedied by including a much 
larger number of questions, but so doing renders a 
traditional examination of too great length. On the 
other hand new-type tests permit pupils to respond 
to a great many more items or exercises than do
traditional examinations consuming the same amount
of time. Therefore they yield much better and more
comprehensive samplings of pupils’ achievement and
so result in more reliable marks. It is very unlikely,
when a pupil must respond to a large number of exercises,
each of which calls for a response more or
less distinct from that of any other, that he will just
happen to know or not to know a much larger proportion
of them than is true for the total amount of
subject-matter covered by the test.

FIGURE 1
Graphic Representation of Sampling Pupils’ Achievements by a Few Questions









FIGURE 2
Graphic Representation of Sampling Pupils’ Achievements by a Large Number of Questions

The fact that new-type examinations provide much
better samplings of pupils’ knowledge than do traditional
ones may be shown by the accompanying figures
which are similar to, but not identical with, those given
by Russell (77, pp. 15, 16, 24). The five rectangles in
each figure represent five pupils, each of whom knows 
50 per cent of the material supposed to be covered by 
a given examination. As will be seen, however, the five 
pupils do not know the same 50 per cent. The dark portions
of the rectangles represent the portions of the
subject-matter known by the pupils, the light portions
those not known. Thus pupil A, who is represented 
by the rectangle so labeled, knows the first half of the 
subject-matter, but is totally ignorant of the second 
half. Pupil B’s knowledge is scattered, being divided 
into five portions as shown. That of pupil C is likewise 
scattered and so located that the portions of the
subject-matter which he knows happen to be just those
which pupil B does not know. Pupil D’s knowledge is
also scattered, but irregularly and in a different manner
from that of either B or C. That of pupil E is likewise
irregular and different from that of any of the
others. 

In Figure 1 the four horizontal lines crossing the 
rectangles represent four general discussion or essay 
questions distributed at equal intervals throughout the 
subject-matter in the hope of getting a fair sampling 
of the pupils’ knowledge. It will be seen that in the 
case of A, two of the four questions would probably 
be answered satisfactorily. B would be able to answer 
none of them, C all four, D only one, and E three. Thus
although all the five pupils had the same total amount 
of knowledge gained from the course, the marks made 
by them on such a four-question essay examination, 
counting 25 per cent on each, would be 50, 0, 100, 25, 
and 75 per cent, respectively. This, of course, is an ex- 
treme and unlikely example, but does illustrate a tend- 
ency of all examinations containing only a few ques- 
tions. 

In Figure 2 are five rectangles similar to those in
the first figure and representing the same pupils, intersected
by twenty equally distant lines representing
that number of questions equally distributed throughout
the subject-matter. It will be seen that in the case
of pupil A, ten of the twenty lines fall within the
shaded area, or, in other words, ten of the twenty questions
fall within the limits of his knowledge. For B, C,
D, and E the figures are respectively nine, eleven,
eleven and ten. In other words the percentile marks, on
the basis of 5 for each question, are 50, 45, 55, 55, and
50, respectively.

Although the five scores are not exactly the same, 
as they should be if the test in question were an abso- 
lutely satisfactory sampling of the material covered 
in the whole course, yet they differ so little that the 
errors in them are almost negligible, especially as compared
with those in the scores on the four-question examination
illustrated in the first figure. If the number
of exercises were increased from twenty up to fifty, 
one hundred, or some still larger number, the agree- 
ment of the scores would become still closer until fi- 
nally a point would be reached at which exactly half of 
the questions would be included within the range of 
knowledge of each pupil. 
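
The comparison of the four-question and the twenty-question samplings may be reproduced with a short computation. The sketch below is illustrative only. The division of the subject-matter into forty equal cells and the particular patterns of knowledge assigned to pupils B, D, and E are assumptions made for the example; they are not read from the figures, but were chosen so that each pupil knows exactly half of the material and so that the counts agree with those quoted in the text.

```python
N_CELLS = 40  # assumption: subject-matter divided into 40 equal cells


def cells(*ranges):
    """Build the set of known cells from (start, stop) ranges."""
    known = set()
    for start, stop in ranges:
        known.update(range(start, stop))
    return known


# Hypothetical knowledge patterns; every pupil knows 20 of the 40 cells.
B = cells((0, 5), (8, 11), (16, 21), (26, 30), (37, 40))
PUPILS = {
    "A": cells((0, 20)),           # knows the first half only
    "B": B,                        # five scattered portions
    "C": set(range(N_CELLS)) - B,  # exactly what B does not know
    "D": cells((3, 8), (11, 14), (18, 22), (27, 35)),
    "E": cells((4, 7), (12, 18), (22, 28), (29, 34)),
}


def question_cells(n_questions):
    """Cells struck by questions spaced at equal intervals."""
    return [(2 * i + 1) * N_CELLS // (2 * n_questions)
            for i in range(n_questions)]


def percent_scores(n_questions):
    """Per cent mark of each pupil, all questions counting equally."""
    questions = question_cells(n_questions)
    return {name: 100 * sum(q in known for q in questions) // n_questions
            for name, known in PUPILS.items()}


print(percent_scores(4))   # few questions: marks scatter widely
print(percent_scores(20))  # many questions: marks cluster near 50
```

Although every pupil in the sketch knows the same amount, the four-question marks range from 0 to 100, while the twenty-question marks all fall within five points of 50.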

The second of the two chief causes for the higher 
reliability of new-type tests is that they are objective 
or nearly so in their scoring. The answers to most 
questions of the traditional type cannot be scored as 
definitely right or wrong, but may be partially right 
to almost any degree. As a result great differences of 
opinion exist among supposedly competent teachers 
as to how much credit should be allowed for the same 
answers. Most of the evidence which has been cited in 
Chapter I and elsewhere to show the unreliability of 
teachers’ marks has consisted of or been based upon 
the marks given traditional examination papers and 
therefore tends to prove the point just made. However,
since a great deal of attention has been given this
matter in educational discussions and literature, it
seems appropriate to refer to several other studies 
and experiments which bear upon this point also. 

All of the data given in Chapter I had to do with differences
between the marks of different teachers, but
similar evidence concerning those of the same teacher
given at different times has also been offered. Douglas
(18, p. 368), for example, tells of an experiment in
which twenty-eight American history examination papers
were marked by four high-school teachers of history,
and then re-marked by the same teachers several
months later. The average differences in marks given 
the same paper by the same teacher were greater than 
5 per cent in the case of all four of those doing the 
marking. In one instance there was a difference of 25 
per cent between the two marks given the same paper 
by the same teacher. The examination used was, of 
course, of the traditional type. 

The unreliability in scoring responses to essay ex- 
aminations due to teachers’ subjectivity is only in part 
caused by disagreements among teachers as to just 
what the correct answers are. Much of it also results 
from the fact that teachers do not agree as to the rela- 
tive importance and therefore the weighting of the 
different parts of the examination. Some teachers at- 
tempt to weight according to supposed difficulty, others 
according to the importance of the facts or of the men- 
tal activities called for, and others on still different 
bases. Not only do their judgments differ as to how 
difficult the various exercises or questions are, but also 
as to how great weight should be assigned particular 
questions upon the relative difficulty of which they are 
agreed. Most teachers who determine weights on this 
basis count more upon the more difficult questions on
the ground that greater ability is required to answer
them correctly. Some, on the other hand, count more 
heavily on the easier questions because they believe 
that it is a greater discredit to pupils to be unable to 
answer these, and that therefore they should be penal- 
ized more heavily if they fail to do so. 

Still further, teachers are influenced in the marks 
which they give pupils’ written work by the pupils’ 
past records, by the general opinion which they have 
of the quality of their work, by handwriting, neatness, 
language usage, style, and so forth. In many cases 
teachers are unconscious that they are so influenced, 
but nevertheless the condition is very real and almost 
impossible to avoid. Moreover, the merit of a paper as 
a whole is likely to influence the marks given separate 
questions. If the answers to the first few questions 
have been very good, the marker is liable to rate any 
later poor answers too high. Similarly if the first few 
answers have possessed little merit, later good an- 
swers are liable to be discounted. A very striking ex- 
ample of the influence of extraneous factors has been 
given by Ballard (2, p. 54). He relates that two students
in an English training college, named respectively
Smith and Jones, were close friends. They were
both members of the same English class and as such 
had to hand in essays every fortnight. They consulted 
one another in their work, exchanged ideas, and pro- 
duced essays apparently very similar. The first essays 
were rated with a mark of “Very Good” upon Smith’s,
and of “Very Fair” upon Jones’, and second, third,
and succeeding essays received the same marks, except
that Smith occasionally had his raised to “Very Good
Indeed,” and Jones had his reduced to only “Fair.”
On one occasion they exchanged and copied each
other’s essays, that is, Jones handed in the essay
really written by Smith, and Smith that written by 
Jones. Nevertheless each on the essay not his own re- 
ceived the same mark he had been receiving on his own 
work. Evidently the instructor had the firmly established
idea that Smith could write essays of considerable
merit whereas Jones could not.

Two or three more references may be cited to show 
the high degree of unreliability of marks given tra- 
ditional examination papers and the higher reliability 
of those on new-type tests. Undoubtedly the most strik- 
ing illustration of the former with which the writer is 
familiar is one which has been mentioned a number of 
times in educational literature. This incident appears 
to have been first related by Wood (95) and is as follows:
About half a dozen expert readers were marking
a set of history papers. One of these readers for his 
own convenience prepared what he considered a model 
paper, that is, a paper containing supposedly correct 
answers to all the questions. By accident this model pa- 
per was included with the students’ papers and passed 
on to several other readers. They rated it on the sup- 
position that it was a student’s paper, and assigned 
marks to it varying from 40 to 90 per cent. 

Some rather convincing evidence concerning reliability
is presented by Ruch, who is one of the leading
advocates of the new type. In one place (74, pp. 23-63),
he gives figures showing the reliability coefficients
of eight New York Regents’ Examinations in their 
ordinary or subjective form, and likewise of the same 
examinations when converted into objective form. The 
average coefficient of reliability in the second case, that 
is, for the objective form, was .65, whereas the average
for the subjective form corresponding to this was
only .42. If a correction were applied to balance the
fact that pupils spent more time¹ working on the
subjective than on the objective form, the figure of
.65 should be raised to .69. Further data of the same
sort are presented by Gates (28), who obtained an
average coefficient of reliability of .54 for true-false 
tests as compared with .35 for essay examinations. 

It would be easily possible to quote dozens, prob- 
ably even hundreds, of reports of results which agree 
with those of Ruch and Gates, that is, show greater
reliability for new-type than for traditional examinations.
There are, however, a few cases in which data
have been obtained which indicate an opposite conclusion.
Thus Crawford and Raynaldo (15) conclude
from their experiments that fifteen out of twenty com- 
parisons indicate that traditional examinations pos- 
sess greater reliability than true-false tests. They 
state, however, that the true-false tests used were 
made by persons comparatively unskilled in so do- 
ing, also that the students upon whom the tests were 
tried out were not familiar with the true-false form. 
Furthermore, in all the comparisons except one the 
traditional examinations preceded the others, and in 
the one in which the true-false test was by accident 
given first, it showed itself distinctly more reliable 
than the following discussion examination. After men-
tioning several other factors, their conclusion is that
the data they present are not sufficient to warrant a
general statement as to which type of test is superior.

¹ The reason or justification for making such a correction is as follows: If
a test is lengthened by adding more of the same type of exercises as compose
the original portion and other conditions are in no way changed, its reliability
is increased. This increase is due to the fact that making it longer causes it to
yield a more satisfactory sampling of the total field covered. Such an increase
in the length of a test results in increasing its reliability by a ratio equal to the
square root of the ratio of its length after the additional exercises have been added
to what it was in the first place. For example, if enough similar material is added
to a test to make it four times as long as it was originally, the reliability of
the lengthened test is twice as great as that of the first one, since the square root
of four is two.
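
As a minimal sketch, the square-root rule stated in the footnote on lengthened tests may be written out as below. The function names are illustrative assumptions and are not taken from any source cited here. The second function gives the Spearman-Brown formula, which is the relation ordinarily used in the measurement literature for estimating the reliability of a lengthened test; it is included only for comparison, and the two rules do not give the same figures. The reliability of .50 in the usage lines is an arbitrary illustrative value.

```python
import math


def sqrt_rule_factor(length_ratio: float) -> float:
    """Factor by which, according to the footnote, reliability is multiplied
    when a test is made `length_ratio` times as long as it was originally."""
    if length_ratio <= 0:
        raise ValueError("length_ratio must be positive")
    return math.sqrt(length_ratio)


def spearman_brown(reliability: float, length_ratio: float) -> float:
    """Spearman-Brown estimate of the reliability of a test made
    `length_ratio` times as long, given the original reliability."""
    if length_ratio <= 0:
        raise ValueError("length_ratio must be positive")
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)


if __name__ == "__main__":
    # The footnote's own example: a test made four times as long.
    print(sqrt_rule_factor(4))                 # 2.0
    # Comparison only; .50 is an arbitrary illustrative reliability.
    print(round(spearman_brown(0.50, 4), 2))   # 0.8
```

The Spearman-Brown estimate can never exceed 1.00, whereas a simple multiplying factor can; this is one reason for treating the square-root rule as a rough approximation only.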

Even when teachers are aware of the large element 
of variability commonly present in their marks and
endeavor to reduce it by careful marking, they are
unable to do so to a satisfactory extent. Some reduc-
tion undoubtedly can be made, but Ruch (71, pp. 55-
62), as well as others, offers experimental evidence
which shows that even teachers who are aware of the 
larger sources of unreliability and endeavor to avoid 
them, still disagree markedly, in a few cases even over 
50 per cent, as to the marks to be assigned papers. 

Another similar example may be cited from the 
writer’s own experience. On a number of occasions he 
has submitted an algebra paper which he prepared 
to members of university classes in Methods of Teach- 
ing and in Educational Tests and Measurements, with 
the request that each student mark the paper. The 
paper contains five exercises and in each the work is 
carried through to completion or practical comple- 
tion, but some one slight slip has been made. This 
probably renders the paper more difficult to mark than 
most papers actually written by pupils. Furthermore 
the persons who were doing the marking were not 
in general trained or experienced teachers in mathe- 
matics. They did, however, know something of the 
causes of subjectivity and unreliability. Therefore it 
would naturally be expected that the variability would 
be greater than in the case of a paper written by a 
pupil and marked by a number of mathematics teachers 
actually in service. However, the results, even allow-
ing for these conditions not representative of typical
school practice, are striking. Marks ranging all the
way from zero to 100 per cent inclusive have been
given the paper. Both extremes have never been rep- 
resented in the same group of students, though on one 
occasion the marks of a single group ran from zero 
to 90, and on another from 20 to 100, per cent. 

To sum up the situation, there seems to be little 
reasonable doubt that if new-type tests and discus- 
sion examinations are constructed with the same de- 
gree of care and expertness, and if the pupils spend 
the same amount of time working on each, the results 
on the former will be decidedly more reliable than 
those on the latter. It is probable that if several varie-
ties of the new-type examination are combined into
one test, the resulting reliability of the total scores 
will be even greater than if only one form is used. In 
many of the investigations the comparisons have been 
made on the basis of a single form only, or, if on sev- 
eral forms, the figures have been reported separately 
for each. 

A disadvantage of the traditional examination con- 
nected with unreliability, but yet worth mentioning 
separately, is that pupils often realize that the marks 
they receive are to some extent due to chance or other 
causes which should not be operative. Not infrequently 
when pupils compare papers after receiving them back 
marked, they find that answers containing almost ex- 
actly the same material have received different num- 
bers of points, or even that an answer containing more 
of the facts called for than does another has been 
marked the same or sometimes lower. Pupils feel, 
therefore, that many of the ratings they receive are 
really unreliable and unjust. On the other hand, the
comparatively high objectivity of new-type tests pro-
duces a distinctly favorable reaction. It renders the
pupils much more satisfied with the marks which they
receive, and enables teachers to justify marks to pu- 
pils and their parents much more easily. When pupils 
compare papers with one another they see that the 
same response has been scored in the same way no 
matter on whose paper it occurred and thus their con- 
fidence in the meaning of marks and the reliability of 
those they receive is increased. Furthermore, the qual- 
ity of near-objectivity renders it possible for pupils 
to score their own answers or those of one another on 
many occasions, and thus both to save the teacher con- 
siderable work and themselves to derive much profit 
from the exercise. 

Not only because of objectivity in scoring, but for 
other reasons also pupils tend to prefer new-type 
tests. It is true that much depends upon the attitude 
of the teacher and how the tests are presented to the 
pupils, but if the teacher is not prejudiced against the 
new type, pupils will almost always favor them. Kinder
(42), for example, reports that of more than 200 stu-
dents, all but seven preferred the new type. May (52)
states that out of 260 pupils, only two raised objec-
tions to them. Bardy (3, pp. 109-10) writes that 231
out of 242 favored them. Kolstoe’s report (43), how-
ever, is not quite so favorable. He found that of about
300 teachers’ college students approximately 30 pre-
ferred the essay type, also that about 75 did not be-
lieve that true-false tests were satisfactory. Brinkley
(7, p. 59), who in general is not as favorable to new-
type tests as are most writers on the subject, found
that pupils preferred a mixture of old and new to all
of either one alone. Gates (28) has also made a study
of the matter and goes into more detail in reporting
reasons why new-type examinations are preferred than
do most of the others. The reasons he gives include
the fact that results can usually be known soon after 
the tests are taken, that less nervousness and fear are 
aroused, that there is little danger of answers being 
misunderstood, that the personal likes or dislikes of 
teachers have practically no opportunity to affect 
scores, and that there is no visual or writing strain.

A fact more or less dependent upon reliability and 
yet distinct from it is that new-type tests possess 
greater validity than do discussion examinations. This 
is caused by their greater objectivity and reliability, 
and also by the fact that the pupils’ answers are very 
little affected by such factors as their ability in Eng- 
lish, handwriting, and so forth. That is to say the an- 
swers are indicative of their knowledge of the subject- 
matter covered, and not of extraneous abilities which 
may enter into their answers on essay examinations. 
Many of the same writers who have dealt with the 
question of reliability have also submitted data re- 
garding validity, as also have others. Wood (95), for 
example, reports coefficients of correlation with a cri-
terion made up of seven separate measures of about
.85 for new-type tests whereas for an essay examina- 
tion lasting practically the same time, the correspond- 
ing figure was only about .55. McAfee (47) likewise 
obtained similar results though the difference was less. 
He found correlations of .75 and .79 for new-type tests 
with a composite measure composed of both new and 
traditional examination marks, standardized test 
scores, and teachers’ marks. The correlation of dis- 
cussion examination marks with the same composite 
was only .66. 

It is true in the case of validity as in that of reli-
ability that not all those who have studied the ques-
tion are in entire agreement. One of the most careful
investigations reported is that of Brinkley, who
reached the conclusion (7, p. 58) that with tests of
equal length, as measured by time spent in testing, 
and prepared by teachers with training in the matter 
of test construction, one type of test yielded prac- 
tically as good results as another for measuring senior 
high-school achievement in history. With one or two ex- 
ceptions, he found this true whether the achievement 
measured was general achievement for the course, abil- 
ity to think with the materials of the course, or in- 
formation. He also states that for measuring general 
achievement in history essay examinations are more 
valid than new-type tests prepared by ordinary high- 
school teachers, and even slightly more valid than 
those prepared by teachers trained in the construc- 
tion of the new type. In the case of new-type tests 
prepared by Brinkley himself the validity equaled that 
of essay examinations. For measuring ability to think 
he found that the two types possessed about the same 
validity, and for measuring stock of information that 
the new type was slightly more valid. 

It seems to the writer that no general conclusion
can be drawn as to which type of examination is more 
valid. The purpose which an examination is intended 
to serve must be taken into account. For the measure- 
ment of stock of information and knowledge of facts, 
the evidence seems to support the statement that the 
new examination is more valid than the discussion 
type. For the measurement of other outcomes of in- 
struction, the data available at present do not warrant 
the statement that the new-type examination is known 
to be superior to the traditional type. In other words,
each has its particular place and its special functions
where it should be preferred to the other. 

Since scores upon objective or near-objective tests 
are helped very little by knowledge that has some- 
thing to do with the point at issue but does not specifi- 
cally include it, their use tends to lead pupils to ac- 
quire relatively exact and detailed knowledge. Also 
new-type examinations point out very definitely the 
particular things which are not known and thus pave 
the way for very definite and purposeful remedial in- 
struction. An essay examination may reveal a general 
lack of knowledge on a certain topic, but it rarely 
points out the exact points which need attention. New- 
type tests not only aid the teacher in diagnostic and 
remedial work, but make it easier for pupils to check 
up on the results of their own study. It is not very 
difficult for a pupil to determine whether he knows 
certain facts definitely or not, and if he finds that he 
is ignorant of some of them, to devote further study 
to those not known. Therefore new-type tests provide 
better motivation for study than do discussion exam- 
inations. 

To some extent traditional examinations discour- 
age systematic and worth-while review. Since they gen- 
erally touch upon only a few topics out of the large 
number included in a course, pupils are likely to take
a chance that the few which such tests cover will hap- 
pen to be among those that they think they know fairly 
well. Sometimes pupils do not do this, but try to guess 
what topics will be dealt with on the examination and 
then review intensively on those, neglecting all others. 
Others make little or no attempt to study because they 
think it too much a matter of chance whether or not
doing so will aid them materially in responding to
the examination questions. 

Another disadvantage of the traditional examination 
is that it does not test the achievements of pupils 
whose powers of expression are poor. That is to say 
because of difficulty in organizing their thoughts and 
expressing them in clear language, pupils may know 
more about the question or topic to be discussed than 
they indicate in their responses. Thus their answers 
depend to some extent upon their ability and knowl- 
edge along other lines than the subject being tested. 
It is certainly desirable to test ability in expression, 
but it should not be done in such a way that a pupil’s 
mark in history or algebra, for example, is a compound 
mark expressing a mixture of his achievement therein 
and also in language, with the proportions of the two 
which enter into it unknown. 

More or less similar to the disadvantage just men- 
tioned is that traditional examinations frequently test 
speed of writing to an undesirable extent. Some pu- 
pils’ rates of writing and freedom from fatigue while 
writing may be enough greater than those of other 
pupils, whose actual ability and achievement in the 
subject being tested is the same, to make very ma- 
terial differences in the marks which they receive upon 
their examination papers. This can easily be avoided by 
allowing sufficient time for all pupils to finish, but, 
as will be shown later, doing so often leads to certain 
other undesirable results. 

Another limitation likewise closely connected with
those just stated is that too much of the time spent 
in answering essay examination questions is ordina- 
rily devoted to what may be called the mere mechanics 
of answering. That is to say, the act of writing and
the determination of the form of answers consume
much of the pupil’s time and attention which should 
be devoted to real thinking about the questions asked. 

Because of the fact that language and handwriting 
abilities play such a large part in pupils’ answers to 
discussion examinations, and further because if suffi- 
cient time is given to these matters, there is frequently 
not enough time left to devote the proper amount of 
attention to the subject-matter itself, pupils tend to 
develop hurried and careless habits of expression and 
writing. They hasten to put down on their papers what-
ever pertinent facts they know and pay little attention
to the form in which they are expressed. This effect
is rendered still worse by the fact that many teachers,
especially high-school teachers of other subjects than
English, pay little attention to such matters as spell-
ing, composition, punctuation, sentence structure, qual-
ity of handwriting, and so forth. Even if they do cor-
rect mistakes along these lines, the attention of the
pupils is not usually called to these corrections in such
a way as to make them very effective. Because of these
facts it may even be said that many essay examina-
tions give positive training in ungrammatical and un-
rhetorical expression, poor handwriting, and other
undesirable habits. 

It is practically impossible to give essay examina- 
tions so that they test the rate of a pupil’s response 
or thinking in the subject dealt with. They frequently
serve to test rate of writing, but not rate of mental 
activity. It is not necessary or even highly desirable 
that all examinations should measure rate, but it 
would be very unfortunate if none did so. In prac- 
tically every activity outside of school life, the person 
who can perform a task as well as another and do so
in less time is rated as more efficient, and the same
should be true in much of the rating of school pupils
and their work. In actual practice, however, essay ex-
aminations as administered and marked have fre-
quently tended to produce the impression that correct
answers are equally valuable whether given in a short 
or in a long time. 

One point in which new-type examinations possess 
considerable advantage over traditional ones is in the 
ease of scoring. Careful and accurate scoring of the 
answers to traditional questions is relatively difficult
and requires considerable time. In the case of many 
already overworked teachers the result is that this 
added burden in addition to their other duties is suffi- 
cient to result in a lowering of their general physical 
and mental vitality and therefore of their teaching 
efficiency. Some teachers avoid this result by reduc- 
ing the number of examinations below a desirable 
minimum and others by scoring pupils’ responses so 
hurriedly and carelessly as to lose many of the pos- 
sible benefits to be derived from giving examinations. 
By the preparation and use of a list of correct answers 
which can usually be put in such form that they can 
be matched with the pupils’ responses, the scoring of 
new-type tests is rendered easy. Not only is time saved 
but the type of mental activity engaged in while scor- 
ing is much less arduous and tiring than is true in 
the case of essay examinations. If it is desired and 
practicable, some clerk or other person who does not 
possess any particular knowledge of the subject-mat- 
ter covered can score most new-type examinations sat- 
isfactorily. It is also frequently possible to have satis- 
factory scoring done by the pupils themselves, who 
may mark their own papers or those of other pupils.
They can easily see just what their errors are and also
learn how to correct these errors. In most cases it is 
not necessary for the teacher to give very much help 
in this respect if the pupils are properly motivated so 
that they have formed the habit of studying their re- 
turned test papers and trying to profit as much as 
possible thereby. They will ordinarily gain much more 
benefit from unaided or only slightly aided study of 
new-type test papers than from that of traditional ex- 
amination papers. It is impossible for pupils to de- 
ceive themselves as to the correctness and quality of 
their answers, as is frequently the case with discussion 
examinations even though the papers have been well 
criticized and marked by the teacher. 
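
The matching of pupils’ responses against a prepared list of correct answers, as described above, can be sketched in a few lines. The key, the responses, and the function names below are hypothetical; one point per correct item is an assumption, and no correction for guessing is applied.

```python
from typing import List, Sequence


def _same(answer: str, response: str) -> bool:
    """Compare a keyed answer and a response, ignoring case and stray spaces."""
    return answer.strip().upper() == response.strip().upper()


def score_paper(key: Sequence[str], responses: Sequence[str]) -> int:
    """Count the responses that agree with the key, one point per item.

    An omitted item is recorded as an empty string and earns no credit.
    """
    if len(key) != len(responses):
        raise ValueError("key and responses must contain the same number of items")
    return sum(1 for k, r in zip(key, responses) if _same(k, r))


def missed_items(key: Sequence[str], responses: Sequence[str]) -> List[int]:
    """Return the item numbers (counting from 1) answered wrongly or omitted."""
    if len(key) != len(responses):
        raise ValueError("key and responses must contain the same number of items")
    return [n for n, (k, r) in enumerate(zip(key, responses), start=1)
            if not _same(k, r)]


if __name__ == "__main__":
    # Hypothetical five-item true-false key and one pupil's paper.
    key = ["T", "F", "T", "T", "F"]
    paper = ["T", "F", "F", "T", ""]
    print(score_paper(key, paper))    # 3
    print(missed_items(key, paper))   # [3, 5]
```

Since the comparison requires no judgment, the same paper receives the same score whoever applies the key, and the list of missed items shows the pupil exactly which points need further study.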

It is usually difficult if not absolutely impossible to 
secure satisfactory norms from essay examinations. 
The difficulty of doing so places very decided limita- 
tions upon the possibilities of comparing achievement 
in different classes or groups of pupils and thereby 
renders it harder for teachers and others to learn 
whether the achievements of their pupils are equal 
to what they should be or not. 

One of the advantages stated for traditional ex- 
aminations was that they are easier to make than 
the new-type ones. This very fact, however, produces
results favorable to the latter. Because it is compara-
tively easy to dash off a few discussion questions in 
almost as short a time as is required to write them, 
many teachers fall into the habit of doing so and of 
giving little or no thought to the selection and formula- 
tion of the questions and exercises employed. As a re- 
sult important topics and portions of the subject- 
matter studied are often entirely or almost entirely 
neglected whereas others are dealt with much more
frequently than there is any need for. Hastily made
questions are liable to be poorly worded, obscure, and
indefinite, with the result that teachers in scoring 
either penalize pupils who cannot understand the ques- 
tions, although the fault is their own, or else give 
credit for answers which are really not what was 
wanted. Furthermore, such careless formulation of 
questions serves to increase the unreliability of marks 
because of securing poorer samplings of the subject- 
matter covered. On the other hand, the fact that the 
construction of a fairly large number of new-type ex- 
ercises or items calls for the expenditure of more 
thought than that of a few essay questions frequently 
causes teachers to be more careful and thoughtful in 
so doing. This of course leads to the result that more 
time is required to construct an objective or near- 
objective test than for an essay examination consum- 
ing the same time. For a very small class this extra 
amount of time will ordinarily more than offset that 
gained in scoring, but for a class of twenty-five, thirty 
or forty this will rarely occur and it is likely that the 
time saved in scoring new-type tests will either balance 
or more than balance the extra amount required in 
their construction. Even if the total amount of time 
required is the same this should be considered a merit 
of new-type examinations because a greater propor- 
tion of it is spent on construction and less on scoring. 
In other words, a teacher spends more time in giving 
consideration to her general objectives, methods, and 
so forth, and less in what is largely drudgery and 
mere clerical work. Therefore the quality of examina- 
tions should be improved because of this fact. 
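
The balance between extra construction time and time saved in scoring can be illustrated by a simple calculation. The sketch below assumes that the saving in scoring is the same for every paper; the function name and the figures in the usage lines are purely hypothetical and are not drawn from any investigation.

```python
import math


def break_even_class_size(extra_construction_minutes: float,
                          minutes_saved_per_paper: float) -> int:
    """Smallest number of pupils for which the time saved in scoring
    equals or exceeds the extra time spent in constructing the test."""
    if extra_construction_minutes < 0:
        raise ValueError("extra_construction_minutes must not be negative")
    if minutes_saved_per_paper <= 0:
        raise ValueError("minutes_saved_per_paper must be positive")
    return math.ceil(extra_construction_minutes / minutes_saved_per_paper)


if __name__ == "__main__":
    # Hypothetical figures: one extra hour of construction,
    # four minutes saved in scoring each paper.
    print(break_even_class_size(60, 4))   # 15
```

With these assumed figures a class of fewer than fifteen pupils would cost the teacher more total time under the new-type test, and a class of thirty would cost her less.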

It is frequently fairly easy for pupils, especially 
those of more than average intelligence, to bluff on
essay examinations. A pupil may know nothing or
practically nothing of what is actually called for by 
a particular question, but if he has some knowledge 
of the general topic with which the question is con- 
nected and perhaps also some skill in guessing, he 
can frequently produce an answer for which he will 
receive much more credit than he deserves. This is 
especially true in cases where the teacher is marking 
the papers hurriedly and carelessly. She is liable to 
notice that the pupil has written an answer of consid- 
erable length, and that it contains a number of words 
and expressions which have something to do with the 
topic, and therefore without careful examination of 
what is written, give him a fairly good mark upon it. 

It is possible to prepare two or more new-type tests 
over the same subject-matter which are very nearly 
equivalent in difficulty, a thing which is practically
impossible with essay examinations. If it is desired 
to give a new-type test of, say, forty items and to have 
two forms of the test, eighty items should be prepared. 
These should then be divided by some random or 
chance method into two lists of forty each. The two 
tests will in most cases not be of exactly the same dif-
ficulty, but it will be unusual if the difference in dif-
ficulty between them is more than a very few per
cent. Moreover in many cases new-type examinations 
can be used over again with comparatively slight modi- 
fications. Since the number of items contained is com- 
paratively large, pupils cannot expect to make high 
scores by studying up upon a very small portion of 
subject-matter, as would be the case if only a few dis-
cussion questions were to be repeated.
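
The division of a pool of items into two forms by a chance method, as described above, may be sketched as follows. The function name and the fixed seed are assumptions made for illustration; any method of shuffling that gives every item an equal chance of falling in either form would serve as well.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

Item = TypeVar("Item")


def split_into_two_forms(items: Sequence[Item],
                         seed: Optional[int] = None
                         ) -> Tuple[List[Item], List[Item]]:
    """Divide a pool of items at random into two forms of equal length."""
    if len(items) % 2 != 0:
        raise ValueError("the pool must contain an even number of items")
    shuffled = list(items)                 # leave the original pool untouched
    random.Random(seed).shuffle(shuffled)  # a fixed seed makes the split repeatable
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


if __name__ == "__main__":
    # Eighty items, numbered 1 to 80, divided into two forms of forty each.
    pool = list(range(1, 81))
    form_a, form_b = split_into_two_forms(pool, seed=1928)
    print(len(form_a), len(form_b))        # 40 40
```

Each item appears in exactly one of the two forms, so that neither form repeats material from the other.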

3. Summary. Although there has been much recent 
unfavorable criticism of traditional examinations, they
should not be entirely discarded in favor of new-type
tests but should be used on some occasions. Each of the 
two general types just mentioned has its peculiar 
merits and advantages and should be employed when 
it best fulfills the desired end. Traditional examina- 
tions are usually much easier to prepare, test a num- 
ber of mental processes better than does the new type, 
do not offer as great opportunity for guessing and 
perhaps not for cheating, are not as liable to the dan- 
ger of confusing the pupil, and in several minor ways 
are to be preferred. On the other hand, new-type tests 
are, as is shown by a considerable mass of evidence, 
more reliable than traditional examinations both be- 
cause they secure better samplings of pupils’ ability 
and knowledge and because their scoring is relatively 
objective. Among the other advantages which they pos- 
sess are that pupils usually prefer them and are bet- 
ter satisfied with the marks which they receive, em- 
phasis is placed upon exact and accurate knowledge, 
knowledge of the subject being tested is measured 
without being mixed with ability in language, hand- 
writing, and so forth, speed can be measured when 
desired, scoring is easier, more thought is usually re- 
quired in their construction, bluffing is more difficult, 
and two or more forms of practically equivalent diffi- 
culty can be prepared. 





CHAPTER VIII 


EXAMPLES OF TRADITIONAL OR ESSAY 
EXAMINATIONS 

1. Types of mental activity to be tested. Before pro-
ceeding to give examples which illustrate the best prac-
tice in the construction of discussion or essay examina-
tions, it is appropriate to consider the types of mental
activity which it is desirable to measure with such 
examinations. It is not meant to imply that thought 
processes of some of the types mentioned should and 
may not be measured by varieties of the new examina-
tion, but merely that they should be considered in con-
structing those of the essay type. Probably the best 
analysis of questions to which reference can be made 
in this connection is to be found in a study by Monroe
and Carter (60). This investigation dealt with the
use of thought questions in high schools, and for this 
purpose the following twenty types of such questions 
were defined:

1. Selective recall — basis given.

2. Evaluating recall — basis given.

3. Comparison of two things — on a single designated basis.

4. Comparison of two things — in general.

5. Decision — for or against.

6. Causes or effects.

7. Explanation of the use or exact meaning of some phrase
or statement in a passage.

8. Summary of some unit of the text or of some article read.

9. Analysis. (The word itself is seldom involved in the
question.)

10. Statement of relationships.

11. Illustrations or examples (your own) of principles in
science, construction in language, etc.

12. Classification. (Usually the converse of No. 11.)

13. Application of rules or principles in new situations.

14. Discussion.

15. Statement of aim — author’s purpose in his selection or
organization of material.

16. Criticism — as to the adequacy, correctness, or relevancy
of a printed statement, or a classmate’s answer to a question
on the lesson.

17. Outline.

18. Reorganization of facts. (A good type of review ques-
tion to give training in organization.) The student is asked
for reports where facts from different organizations are ar-
ranged on an entirely new basis.

19. Formulation of new questions — problems and questions
raised.

20. New methods of procedure.

As Monroe and Carter state, this list is not intended
to be absolutely exhaustive, but it appears to be suffi-
ciently detailed to serve the purpose. Indeed, certain
combinations might be made without reducing its use- 
fulness. 

If one studies the twenty types of thought questions 
just named, keeping in mind the possibilities of con- 
structing questions appropriate for each, it will be 
seen that in some instances at least both essay and 
new-type questions may be devised which will be fairly 
satisfactory for the purpose. On the whole, however, 
it is evident that in the cases of most of them it is
easier to prepare discussion than new-type questions 
which are appropriate to these types of mental activ-
ity, and that in those of some it is almost if not abso-
lutely impossible to construct satisfactory new-type
questions. In the following section will be found ex-
amples of questions or exercises calling for each of the
twenty types of mental activity just named. 

2. Examples of essay questions calling for the twenty 
types of mental activity named above. Some of the ques-
tions and exercises given below have been taken from
the bulletin by Monroe and Carter, in which they are 
cited as illustrations of the types. Others have been 
collected from various sources by the present writer, 
and a very few are original with him. 

1. Selective recall — basis given.

Name the presidents of the United States who had been
in military life before they were elected.

What do New Zealand and Australia sell in Europe that
may interfere with our market?

2. Evaluating recall — basis given.

Which do you consider the three most important Amer-
ican inventions in the nineteenth century from the stand-
point of expansion and growth of transportation?

Name the three statesmen who have had the greatest in-
fluence on economic legislation in the United States.

3. Comparison of two things — on a single, designated basis.

Compare Eliot and Thackeray as to ability in character
delineation.

Compare the armies of the North and South in the Civil
War as to leadership.

4. Comparison of two things — in general.

Compare the early settlers of the Massachusetts Colony
with those of the Virginia Colony.

Contrast the life of Silas Marner in Raveloe with his
life in Lantern Yard.

5. Decision — for or against.

Whom do you admire more, Washington or Lincoln?
Why?

In which in your opinion can you do better, oral or
written examinations? Why?

6. Causes or effects.

Why has the Senate become a much more powerful body
than the House of Representatives?

What caused Silas Marner to change from what he was
in Lantern Yard to what he was in Raveloe?

7. Explanation of the use or exact meaning of some phrase
or statement in a passage.

Explain the meaning of the expression “Sinais climb”
in the line: “We Sinais climb and know it not.”

Explain the meaning of the word “original” in the
statement: “The Supreme Court has original jurisdic-
tion only in cases wherein a state or diplomatic represen-
tative is a party.”

8. Summary of some unit of the text or of some article read.

Summarize in about one hundred words the advantages
of the hot-air furnace.

Summarize in not more than one page what is to be
found in the text concerning reconstruction in the South
after the Civil War.

9. Analysis.

What characteristics of Silas Marner make you under-
stand why Raveloe people were suspicious of him?

Mention several qualities of leadership.

10. Statement of relationships.

Why is a knowledge of botany helpful in studying ag-
riculture?

Tell the relation of exercise to good health.

11. Illustrations or examples (your own) of principles in
science, construction in language, etc.

Give an original sentence in Latin illustrating the use
of the infinitive in indirect discourse.

Show how some one phenomenon with which we are fa-
miliar in everyday life illustrates the fact that heat
commonly causes expansion.

12. Classification. (Usually the converse of No. 11.)

What is the construction of the word “me” in the sen-
tence “The boy gave me his book”?

To what group of plants do the mosses and liverworts
belong?

13. Application of rules or principles in new situations.

In what countries other than Brazil would you expect
to find rubber plantations?

What chemical properties would you expect phosphorus
to possess, knowing that its position in the periodic table
is below that of nitrogen?

14. Discussion. 

Discuss the Monroe Doctrine. 

Discuss early American literature as best you can in 
about two hundred and fifty words. 

15. Statement of aim — author’s purpose in his selection or 
organization of material. 

What was the purpose of the author in having Athel- 
stane return to life after he was apparently dead? 
Why do you suppose the author of this history relates 
the story of Barbara Frietchie when her act had no bear- 
ing on the outcome of the war or even of any particular 
battle ? 

16. Criticism — ^as to the adequacy, correctness, or relevancy 
of a printed statement, or a classmate’s answer to a question 
on the lesson. 

Do you believe that the following statement is true? 
Give your reasons. ^^The South would have won the Civil 
War if it had possessed an adequate navy.” 

Criticize: “Macbeth was wholly indifferent to the super-
stitions of his time.”

17. Outline. 

Outline in not more than one page the chief events of 
the French and Indian Wars. 

Outline the Constitution of the United States, including 
the amendments passed very shortly after the adoption 
of the Constitution. 

18. Reorganization of facts.

Trace the life of a glacier from its beginning, showing 
the steps in its origin, its development, etc., down to its 
destruction. 

Select the incidents which characterize Portia in the 
Merchant of Venice. 

19. Formulation of new questions: problems and questions
raised.

In addition to what is stated in your text, what other 
questions can you think of which need to be answered to
explain why some portions of the earth’s surface are
higher than others?





If you were asked to state how much you could trust
the viewpoint of a particular historian about whom you 
know little or nothing, what questions would you want 
to have answered concerning him? 

20. New methods of procedure. 

How might the plot of Julius Caesar be changed to make
it a comedy rather than a tragedy?

How would you prove or disprove the statement that oil 
is a more efficient heating fuel than soft coal? 

3. Examples of good discussion questions in literature. 
The following questions, taken from an article by 
Achtenhagen (1), are representative of the best type
of discussion questions in the field of literature. The 
first three of them call for a review of material al- 
ready studied, but are also thought-provoking and 
opinion-forming. The questions are as follows:

1. “Do you agree with Bacon when he says that there are
certain things which ought to be privileged from jest?
What are your reasons?”

2. “Have the Songs of Labor helped you in any way?
Did you come across an idea that influenced you, a line that 
seemed to fit your case, or a characteristic worth imitating?” 

3. “It has been said that literature can do for you the
following: 

(a) Express your thoughts and feelings. 

(b) Keep before you an ideal. 

(c) Give you a knowledge of human nature.

(d) Restore the past to you. 

(e) Show you the glory of the commonplace. 

Using concrete illustrations from your work during the 
past semester, tell to what extent it has done each of 
these for you.” 

The next two are of the type which may be used 
to effect a combination of what has already been 
studied with new material. 





1. “State the philosophy which is expressed in each of 
these quotations. (Three, taken from unfamiliar matter, are 
given.) Which author you have already studied had the same 
philosophy? If you can quote a line to prove the similarity, 
do so.” 

2. “Interpret this statement: ‘We are incredibly heedless 
in the formation of our beliefs, but find ourselves filled with 
an illicit passion for them when any one proposes to rob us 
of their companionship.’ (J. H. Robinson.) Have any of the
poems or class discussions caused you to change your beliefs? 
Was Mr. Robinson’s theory true in your case?”

There are many possibilities in formulating ques- 
tions dealing with entirely new material. One of the 
most profitable of these is illustrated by the two ques- 
tions below, which call for the interpretation of selec- 
tions. 

1. “Interpret this poem (‘Victory in Defeat,’ by Mark-
ham). Do you agree with the author? What are your
reasons?”

2. “Emerson’s theory of books is as follows; interpret it 
and give your opinion on the same question. ‘It came into 
him life; it went out from him truth. It came to him short- 
lived actions; it went from him immortal thoughts. It came 
to him business; it went from him poetry. It was dead fact;
now it is quick thought. Precisely in proportion to the depth
of mind from which it issued, so high does it soar, so long
does it sing.’ ” 

4. A completion essay examination. Completion ex- 
aminations are usually considered as being of the new 
type, but it is possible to have them of the traditional 
type also. In such cases, as is shown by the example 
below, the portion to be supplied by pupils is com- 
posed not merely of single words but of larger units 
of thought, even whole paragraphs or more. Such ex-
aminations are very rarely used, but they appear to
offer decidedly valuable opportunities for testing men-
tal activities and thought processes of a rather high
order. The one given below deals with a college or 
university subject, philosophy, and is rather long, but 
nevertheless is quoted in full because it is the best 
example of this type of examination which the writer 
has ever seen. It was actually used in one of our well- 
known eastern colleges. The examination itself as 
given below is explained by a few sentences which are 
taken verbatim from Calkins’ article (11) contain-
ing it.

“The italicized portions reproduce the examination paper
(chiefly the work of Miss Flora I. MacKinnon) as it was 
placed before the students. Only the question numbers and 
the mechanical directions are omitted. The unitalicized parts 
are taken verbatim from the examination books of several 
students. The editor has with difficulty selected this material 
from a much larger number of ‘returns’ almost equally well
suited to her purpose. She has often omitted and sometimes 
transposed but has added only the transition words and 
phrases which are enclosed in square brackets. As will be 
evident, the ‘Last Words’ embody the individual conclusions 
of different students.” 

The actual examination was as follows:

“A PHILOSOPHICAL SYMPOSIUM 

“TIME: An evening in early June, 1922.

“PLACE: Lake Waban.

Six girls in two canoes, having finished their supper, are 
drifting quietly down the lake with the canoes held together 
to make possible a general conversation. After a brief dis-
cussion of the final examination in philosophy which they
have all taken in the morning, they find that they have among 
themselves representatives of most of the types of philosoph-
ical theory studied during the course. Freed from the neces-
sities of preparing for examination they fall to discussing
on their merits these various conceptions of the nature of 
reality. 


Delia, who holds a dualistic view.

Bernice, a Berkeleian idealist.

Matilda, known as Mattie, the materialist. 

Polly, a pluralistic personalist (one who holds a personal-
istic conception of nature).

Abbie, the upholder of absolute personalism. 

“DIRECTIONS:

(1) Complete the speeches suggested below in accordance
with the assigned characters of the speakers.

(2) Add a last word from the speaker who most nearly
represents your own opinion.

“Delia. To get down to concrete facts, Bernice, just what do 
you think this canoe really is, and just what do you think 
you really are? 

“Bernice. As for your last question, I think as you do that 
I am a unique, thinking, unified self who has ideas and
who knows herself through her ideas. Not that I am my 
ideas for I am indeed a distinct individuality but I have 
ideas and I know myself through them. 

The canoe in my opinion surely exists but only in our 
minds and the mind of God. 

“Delia. I don’t agree with you at all. I am sure the canoe
is material reality existing independent of my mind. I 
believe with you, of course, that its color is not a char- 
acter that belongs unchangeably to it, in fact, that the 
weight and texture and all the secondary qualities are 
not distinctly its characters. But I do maintain that its 
extension and motion exist entirely independent of me 
or any one else and that they are stable and unchanging 
characteristics of the canoe. What is more, nine people
out of ten will agree with me. Why do you insist on be-
ing different?

“Bernice. It is only carrying on in the same line a part of 
your own description of the canoe. You admit that color, 
for instance, does not exist unperceived. I insist that the
same thing is true of extension, for surely, when I am
up in a tower looking down, men look no more than pig- 
mies five or six inches high. But when I am down on the 
ground among them I realize that they are all five or 
six feet high and that their size varies with my point 
of view. Yet if they were composed of matter existing en-
tirely independent of my mind it would not be in my
power to see them other than they really are, six feet 
or six inches high. But, since extension and motion like 
the other qualities exist only in the mind, I can under- 
stand the world I live in. 

You remember the man who talked in Billings last
year on Einstein? Well, it’s something like that. It all 
comes back to idea and the mind. 

“Mattie. Nonsense! What the canoe is, is matter. We know 
it is wooden, painted green and about seven feet long. 
It is made up of atoms reducible to electrons. The canoe 
is matter. How can you believe anything else as a self-
respecting Physics major? I’m sure my Chem. makes
me believe in matter. And what you are, and what I am, 
and what every one else is, is matter. We are conscious 
because of our brains which function — oh, you know the 
processes we learned in Freshman Zoo and by laboratory 
experiments. (I’ve dissected a cat’s brain myself!) The 
brain is one of the bodily organs. Its particular func- 
tion is to give consciousness which is only one of the 
properties of matter. 

“Delia. But I never could see why you reject Descartes’s
hypothesis of two kinds of reality, thinking substance 
and matter, a substance having the quality of extension, 
and independent of mind as far as its existence is con- 
cerned, though it may be influenced by mind. 

“Mattie. That view is perfectly impossible. For one thing it 
cannot possibly be squared with the theory of evolution, 
for we can trace the growth of the mind or brain, which 
you call spiritual, from the earliest form of life, even 
from the most primitive forms of plant life, all the way 
up to man. And all through this long evolutionary scale 
we are dealing with only one substance, namely, material 
reality, or matter, which is governed by the universal 
law of conservation of matter and of energy. Since all
the way through we are dealing with only one substance,
you have no right to introduce a different kind of sub- 
stance, soul or spirit, at any place in the scale. Moreover, 
you simply couldn’t do this, without increasing the sum 
total of energy in the world, and this is contrary to the 
universal law of the conservation of energy. 

“Bernice. I am with you, Mattie, in saying that there can
be only one kind of reality in the universe. But I am sure
that Abbie and Polly will agree with me that this one 
kind is not matter but spirit [with] its ideas. Your argu- 
ment from evolution does not in the least prove what 
you say it does (although of course we accept that theory
in some form or other) for your proof [is] that starting 
out with matter you end up with matter. That is not the 
point; how do you know you started out with matter? I
don’t believe you did. The world [is] a complex of ideas 
existing in God’s mind and became revealed to finite 
spirits. Of course, I expect you to say that there were 
no finite souls for this evolving universe to be revealed 
to. But you can not be sure. There may have been sub- 
human souls or minds present. 

“Mattie. But how can you seriously say, as you apparently
do, that this canoe is nothing but a lot of sensations that
you happen to be having just now? How does this mere
idea of a canoe keep you out of the water? 

“Bernice. That’s a better joke than argument. Both the canoe
and the water do exist, but they exist as ideas. Be con- 
sistent, Mattie! 

“Delia. That is all very well for your perceptions of the
canoe and the water, but there must be a cause for those
perceptions, and that cause is matter. 

“Abbie. Why matter?

“Delia. You sound as if you had never heard of Descartes’s
argument that the cause of perception must be matter.

“Bernice. Don’t quote that obsolete argument to me. Berke-
ley answered it long since. Matter as Descartes described
it simply can not be the cause of anything at all because
matter by definition is inactive, and to be the cause of 
something requires activity, equally by definition. And 
matter certainly can not be the cause of mind because 
they are essentially different according to Delia’s defini-
tion, and the cause must have all the qualities of the
effect. 

“Mattie. Well, if ever there was an obsolete and archaic ar-
gument it is certainly yours. Matter in any even halfway
modern type of thought about it can perfectly well be a 
cause. Consider the atom, infinite millions of it, each one 
vibrating at an almost incalculable rate of speed, com- 
bining in infinite variety to form all the substances of the 
known world, and stop saying that matter can not be 
the cause of your feeble minds, or of anything else.

“Bernice. I never for one moment denied that my percep-
tions have a cause. I simply insisted, and continue to in-
sist, that the cause can not be matter. Your bringing in
of this theory of oscillating atoms to bolster up a bad
case doesn’t help it at all for the vibration of atoms is
really, after all, only a matter of motion and extension, 
and Berkeley proved that motion and extension are sen- 
sible qualities existing only as they are perceived, so 
there you are back to the mental again. You can’t get 
away from it. [Moreover] if you will carefully consider 
what you conceive of when you think of activity, you 
will realize that the only form of real activity in the 
world is volition. 

“Polly. What you say about the nature of physical objects,
Bernice, may do well enough for the canoe, but you will 
never convince me or any other lover of nature that this 
willow, trailing its leaves in the water, exists only as an 
idea without any reality of its own. I, for one, believe
with Leibniz and some contemporary philosophers that 
a tree is a sub-human self, or part of a sub-human self, 
or a collection of sub-human selves with perceptions as 
indistinct as our own in our sleepiest moments, or when 
we faint or take an anesthetic. (You know that feeling of 
‘I’m just barely here’ that you get just as you drop off.)
These selves are in constant flux and transformation, a 
sort of metamorphosis. Of course Mattie couldn’t per- 
form an experiment to prove this, but it is conceivable 
that [our tree-percepts] are signs of such sub-human 
selves just as our bodies are ideas that are signs of our 
selves. 

“Mattie. I don’t see that you have any right to that theory,
Polly. If you and Bernice really mean what you say I
think you should come to the conclusion that nothing at
all exists except your own ideas.

“Bernice. Quite the contrary. I know there is a real physical
world because I perceive it. And because I know that, I
come to the conclusion that God exists, for many of my 
perceptions come to me against my will, and since they 
must have a cause that cause is outside myself. Then if 
they aren’t my ideas, they must be ideas in some one 
else’s mind. But the world of physical objects is so in- 
tricate, so stupendous and so marvelous that I have to 
conclude that it exists in the mind of an all-wise, all- 
powerful, all-knowing and entirely good being who is 
God. Also I consider that I have some reason to believe 
that other finite selves like myself exist.

“Abbie. If you followed out your argument to its logical con-
clusion you would agree with me that God is not simply
the creator of the universe but is an Absolute Self in- 
cluding within himself all other selves and spirits in the 
universe, and [that he is] something more — a personal- 
ity beyond the sum of the selves that are parts of him. 
Because, don’t you see that it is ultimately impossible
for God to cause in us ideas as you say he does, unless 
there were some vital connection between him and us. 
According to what you say there would have [to] be an 
association between his mind and ours so intimate that 
consciousness is shared. This gives us the conception of 
ourselves as a part of God, the whole universe being ul-
timately one soul and one spirit in which we are all con-
tained. This all-including Absolute Spirit shares all our
experiences because he is more, not less, than we are.

“Polly. It is a pretty theory, Abbie, the chief trouble with
it being that I don’t see how you can reconcile individ- 
uality with the [doctrine] that we are all part of one 
great, all-including self. If we are all parts of the same 
whole how can we hold such radically different opinions,
and be at such variance among ourselves? Besides, how
can many selves make one self? And furthermore if the 
Absolute Self shares our experiences would He not be 
limited too? 

“Abbie. You never would see reason along that line. But if
you don’t accept my conclusion, how can you ever recon-
cile the conception of God that you and Bernice seem to
agree upon with the awful evils of sin and suffering in
the world?

“Bernice. Evil as we see it is not incompatible with the real-
ity of God, for God does not really cause evil, we are the
cause of it (we have been given by God the limited free-
dom of will) and are thus far responsible for the evils
we commit by that freedom. He does not cause the evil. 
Furthermore evil is a part of the whole which is good. A 
thing may be evil when it is considered alone — but con- 
sidered in connection and relation with the whole which 
is good — it is good. 

“Abbie. That won’t do, Bernice. You are arguing in a vicious
circle, for you have just argued the existence of God 
from the perfection of the physical world, and now you 
turn completely around and argue the perfection of the 
physical world from the character of God. You can’t use 
each to prove the other. 

“Mattie. But both of you start, just as Delia does, with a
pure assumption, namely, that there is spiritual, incor-
poreal reality in the universe. If you assume that to be-
gin with, of course you can prove anything.

“Delia. I don’t assume that at all. I hold that I immediately
know the existence of myself and of my modes of con- 
sciousness, for in the very act of trying to doubt my own 
existence I discover that there must be some one to do 
the doubting. 


“LAST WORDS

“Mattie. How do you know that the self you are conscious of 
isn’t matter, a form of it? It is the one thing you know
first of all, yes? But why assume it to be spiritual?
Your self is just that part of the material world you
know first of all, and most surely, and by means of which 
you learn to know the rest. 

“Delia. I believe in spirit because I am directly conscious of 
myself as a spirit and because all of my little religious 
experience demands belief in spirit. Belief in the mate- 
rial reality of things about me seems much simpler than 
to believe in a [wholly] mental world. Belief in matter
is [in fact] just a simple and sensible belief after study-
ing science and evolution; [whereas] saying that evolu- 
tion is a series of ideas somebody might have had seems 
to me to hedge rather awkwardly. As for the inter- 
relatedness of matter and spirit it seems very conceivable 
to me. My soul does affect my body. How else would I be 
different from a machine? Yet I assuredly am not a ma-
chine because I am conscious of myself. In the world
about us we see unlike things interact and just so I hold 
that mind and body affect each other. 

[To sum up:] I believe in matter because of the ra-
tional simplicity and reasonableness of the idea. I believe
in spirit because I can not doubt that I am spirit and I 
feel the existence of other spirits and God, not only nec- 
essary but reasonable. I believe in the relation between 
spirit and matter because it explains so many phenom- 
ena. 

“Bernice. But I do not see, Delia, how you can assume the
existence of material substance when the only proof you 
have of its existence is your perceptual, sensory [expe- 
rience]. It seems to me much more reasonable to conceive 
the universe as made up of spirits and ideas, the spirits 
the subjects and the ideas the objects of the perceptions. 
God is conceived as the infinite spirit, in whose mind ex- 
ist all the ideas which give rise to my perceptions. 

“Polly. We are all beating around the bush fearfully, girls.
I really don’t believe any one has yet proved anything 
without a single flaw throughout. Of course, we want to 
find out the truth if we can. But it seems to me that 
everyone, so far, has formed her idea and then searched 
for arguments to prove it. Personally, I love to think that 
everything is a self: that Lake Waban feels us drifting 
on her top, that the willow over there really knows that 
the sun is out and shines down on her leaves. That’s 
what I always have liked to think. So when Leibniz en- 
tered my life, I grabbed him, and settled back with a 
satisfied feeling. But what we all ought to do, if we really 
want to get at the truth, is to come, absolutely unbiased, 
to the threshold of philosophical arguments and follow 
out a completely logical line of thought. I do believe
now [that] the most adequate interpretation of the uni-
verse is as consisting of separate, distinct selves, not as
Leibniz holds, simple and ‘without windows,’ or rela-
tions to others, but interrelated, forming a great com-
munity but not all parts of an Absolute Self. God is the
ideal Self, not the cause of other selves or of ideas as 
some Pluralists hold, but the perfect Self toward which 
we strive. Evil in the world is due, not to God, but to 
conflicts of finite selves. Each self has freedom of will 
and choice, and each can do his part in making the com-
munity better and approaching the ideal: God. The phys-
ical world is either one or numerous selves, it does not
vitally matter which. With this theory may be correlated 
[the] idea of morality as acting and willing to further 
the ‘complete desired experience’ of the ‘Universal Com- 
munity.’ 

“Abbie. Well, Polly, I think your God might have known
better than to give men free wills to mess up the world 
with. And how can [your] selves be both related and dis- 
tinct? It seems to me that one of these characters must 
exclude the other. [In fact] all your systems either evade 
or omit questions which are not easily solved. You, Delia, 
have failed to explain the relation between the corporeal 
and the spiritual created ‘substances.’ On one occasion 
you have said that matter and spirit are absolutely un- 
related, and, on another, that the body and the soul are 
related through the pineal gland in the brain and that 
the soul only [affects] the directions of the movements 
of the body. This is inconsistent [and] it is also, I be- 
lieve, untrue. For I am conscious of willing movement in 
varying degrees of amount. For instance, my volition to 
do a backjack from the spring-board into the water is 
quantitively greater than my volition to pull down the 
window-shade to keep the sun out of my room. 

Mattie’s system is evidently very plausible in itself, but 
it is rendered ridiculous [by] not being able to prove that
its fundamental assertion, ‘matter exists,’ is true. Bernice 
[does not] adequately explain the relationship of God 
and finite selves, how God gives us our ideas; and she 
argues in circles trying to make the sin and evil of the
world compatible with her idea of God. My theory ade-
quately explains the relationship of God and finite selves,
how God gives us our ideas, [for] all the lesser selves
are parts of the Absolute. [And] I have a legitimate 
argument for the existence of God, whereas [Bernice’s]
falls flat at once, if the universe can be proved otherwise 
than good. Then, too, my theory gives a better solution 
of the problem of evil in the world, for according to it 
God experiences all that finite selves experience and one 
can not conceive of his willing for Himself anything that 
He did not consider good. This makes the Absolute, or 
God, the author of evil, which may shock some people,
but a careful consideration shows that this conception
of a suffering God is the only alternative to a dualistic
theism if one faces the problem of evil squarely and tries 
to solve it.” 

5. Traditional examination questions selected from 
those actually used by public-school teachers. Few of 
the examination questions previously given in this 
chapter can really be said to have been taken from ex- 
aminations as ordinarily given in our public elemen- 
tary and high schools. It seems in place, therefore, to 
include a number of questions actually used by public- 
school teachers. Those given below represent a selec- 
tion of distinctly good questions made according to the 
judgment of a single person, but one who had made a 
study of examinations (59, pp. 62-71). They were 
chosen from a rather large number of final examina- 
tions given in elementary and high schools in the 
State of Illinois. It will be seen that the questions 
given call for both factual memory and higher thought 
processes, such as explanation, discussion, organiza- 
tion, discrimination, etc. 

“HISTORY (Seventh grade) 

“1. What was the purpose of Columbus’ voyage and its
result?

“2. What did ten of the following explore or discover:
Cabot, Balboa, Ponce de Leon, Magellan, Coronado, De Soto,
Drake, Hudson, Cartier, Champlain, LaSalle?

“3. What people settled Jamestown, Virginia? Discuss one 
of the following topics in connection with Virginia history: 
The Starving Time, Individual Ownership, Tobacco Raising.

“4. Tell by what class of people and why was each of the
following states settled: Maryland, Carolina, Georgia.

“5. Tell the story of William Penn, naming the colony he
founded. 

“6. Why did the Pilgrims come to America? Where did
they land? Write of their first winter in America and their 
relations with the Indians. 

“7. Tell what happened in 1492, 1607, 1619, 1620, from
1519 to 1522. 

“8. For what nation and on what errand did Joliet come 
to the Illinois country? Why did Marquette come with him? 

“9. Tell the story of Starved Rock. What tribes of Indians
were connected with it and how? 

“10. Write an item of historical interest about each of
five: Tonti, Stuyvesant, Oglethorpe, Baltimore, Bacon, Brad- 
ford. 

“11. Write a short paragraph about two of the following:

1. Illinois pioneers. 

2. Illinois rangers.

3. Block houses. 

4. Keel boats. 

“HISTORY (Seventh grade)

“1. What were the results of the Revolutionary War?

“2. Explain in what ways Congress was weak under the
Articles of Confederation. 

“3. Who was Lafayette? What did he do for American 
liberty? Why? 

“4. What was the Ordinance of 1787?

“5. A convention of delegates from the states was called
to meet in Philadelphia in May, 1787. Why?

“6. Who was the first President of the United States under
the Constitution? When and where was he inaugurated?

“7. Name and tell how important inventions have helped 
the progress of the United States. 





“8. Why was the purchase of Louisiana an important
event for the United States?

“9. What was the result of the War of 1812?

“10. Tell something of the work of the Humanitarians and
the establishment of the free elementary schools.

“HISTORY (Eighth grade)

“1. Discuss five powers or duties of Congress. 

“2. Name the officers of the President’s cabinet and a
duty of each. 

“3. Explain the need of a survey system. Make a diagram 
showing base line, principal meridian, township lines, range 
of townships. Locate Twp. 2 N. R. 3 E. of P. M. 

“4. How were ten of the following connected with the 
Civil War: Stonewall Jackson, Major Anderson, Robert E. 
Lee, Jefferson Davis, McClellan, Hooker, Grant, Sherman, 
Emancipation Proclamation, Hammering Campaign, Gettys-
burg, and Appomattox Court House?

“5. Tell the location, time, inventor, and importance of 
the Atlantic Cable. 

“6. Tell of two good results of the Civil Service Reform. 

“7. Name and discuss two famous laws we have studied
this semester. 

“8. Write a brief paragraph discussing the importance of 
the Pan-American Congresses. 

“9. Give the time, place, purpose and importance of an 
exposition studied this semester. 

“10. State two causes and two results of the Spanish- 
American War. 

“11. Write a statement about each in connection with the 
World War: Autocracy; ‘der Tag’; submarine; Lusitania;
armistice. 


“HISTORY (High school) 

“1. Discuss the work of Spain in exploration, naming five
important explorers. 

“2. State the Mercantile Theory of trade and explain its 
effects during the American colonial period. 

“3. State five defects in the Articles of Confederation. How
were these defects remedied in the Constitution? 





“4. What was the Northwest Ordinance of 1787? Why
important? 

“5. Explain Alexander Hamilton’s policy on the United 
States Debt. 

“6. Give the history of the Nullification Controversy of 
1828-33. 

“7. Give the history of the election of 1823. 

“8. Identify the following: Gallatin; Oglethorpe; DeWitt
Clinton; Thomas Paine; Stephen Decatur.

“9. Explain three important results of the War of 1812. 

“10. Discuss the political platform and policies of Presi-
dent Jefferson.

“ANCIENT HISTORY (High school) 

“1. (a) Name and explain the sources of historical in- 
formation. 

(b) Name each of the Oriental nations in the order of 
their development. State what was done by each 
for civilization. 

“2. (a) What were the causes of Greek colonization?
What relation did the Greek colony have to the 
mother city? 

(b) Locate the chief centers of colonization and state 
for what each was famous. 

“3. (a) Trace out the history of the ancient Hebrew peo-
ple and explain their service to humanity.

(b) Describe the government and customs of the 
Spartans. 

“4. (a) Explain the origin, growth, and effect of the De- 
lian Confederacy on Greek history. Show how it 
changed into the Athenian Empire. 

(b) Describe Athens in the time of Pericles. 

(c) Describe the intellectual greatness of Athens in 
the time of Pericles. 

“5. (a) Identify the following men and account for their
greatness:

1. Alexander
2. Themistocles
3. Aristides
4. Plato
5. Socrates




(b) Give an account of the invasion of Greece by 
Xerxes during the Persian Wars. 

Name important battles and tell in detail about 
one. 

“6. (a) What did Philip of Macedon accomplish for Mace- 
donia? 

(b) Trace the march of Alexander against the Per- 
sians. 

“7. (a) Why was Europe better fitted than Asia to de-
velop the highest civilization?

(b) What mountain systems of Europe are not off- 
shoots from the central mass of the Alps? 


“GRAMMAR (Eighth grade) 

“1. Of what value is a good vocabulary? 

“2. How can a person acquire a good command of words? 

“3. What is an antonym? Write five words and give their 
antonyms. 

“4. Define synonym. Write five words and give their syn- 
onyms. 

“5. Define verb. A verb phrase. 

“6. Write a stanza of the ‘Star Spangled Banner.’ Punc- 
tuate correctly and underline the verbs and verb phrases. 

“7. Define transitive verb. A direct object. Write a sen- 
tence containing a transitive verb and a direct object. 

“8. Name the eight parts of speech. 

“9. A boy is flying a kite. 

A crow is flying over the cornfield. 

Are the verbs in the above sentences transitive or intran- 
sitive? Why? 

“10. Use the verbs ‘lie’ and ‘lay,’ ‘sit’ and ‘set,’ correctly
in sentences.

“11. Classify the nouns and verbs as to number in the following:

(a) They came
(b) We come
(c) We have come
(d) We had come
(e) The house will be built
(f) The horses were running
(g) They have been seen
(h) We saw





“12. What is a participle? Underline and give the grammatical
construction of the participle in the following sentence:

Crossing the street, I lost my hat. 

“13. What are the distinguishing marks of a verbal noun?

“14. Give the construction of the verbal nouns in the following
sentences:

(a) To obey is a cardinal virtue. 

(b) Most boys like to play basketball. 

(c) Playing baseball is hard work. 

(d) I enjoy hearing pupils read. 

“15. Name and define the tenses. Tabulate the tenses of
the verb ‘call.’ 

“ENGLISH (High school, 1st year) 

“1. (a) In comparison with other languages is English old 
or new? 

(b) Why is the study of Latin important to us? 

(c) Give an example of a word derived from Latin 
and explain its parts. 

(d) Explain by illustration the use of a prefix.

(e) Define literally: irregular, international. 

“2. (a) Explain the value of good pronunciation.

(b) List five words that you have been mispronounc- 
ing. 

(c) Punctuation aids one in what way? 

(d) Illustrate the use of one punctuation mark. 

(e) State and illustrate the simple rule of spelling in 
regard to ei and ie. 

“3. (a) Name the eight parts of speech. Illustrate.

(b) Why should one make a good choice of words in
writing or speaking?

(c) Write an exclamatory sentence.

(d) Give the plurals of: datum, radius, lady, alumna, 
monkey, index, oasis, cargo, solo, volcano. 

(e) Give the feminine of: abbot, hero, wizard, sir, lord.

“4. (a) How would you distinguish poetry from prose?

(b) Give an example of rhyme. 

(c) Name three poetic qualities of ‘The Ancient Mari- 
ner.’ Illustrate. 




(d) Quote your favorite stanza from the poem, ‘The
Ancient Mariner.’

(e) Name two descriptive passages from ‘The Vision
of Sir Launfal.’

“5. (a) What is the purpose of the drama?

(b) Name and give the dates of our greatest English 
dramatist. 

(c) Is The Merchant of Venice tragedy or comedy? 
Why? 

(d) Name the four stories which make up the plot. 

(e) Quote three passages from the play. 

(f) What is the climax of the play? Explain why.

(g) Give setting of the story. Its source. 

“6. (a) Describe the theater of Shakespeare’s time, or
characterize Shylock carefully, illustrating your
points.


“ENGLISH (High school)

“1. (a) Discuss the work of three colonial prose writers.

(b) Discuss the work of two colonial poets.

“2. (a) What were the general tendencies of the literature
of the Revolution?

(b) Discuss the works of two writers who were closely
connected with governmental affairs.

“3. Who was the first American novelist? Tell about his
works and characteristics as a writer.

“4. (a) Give five important facts concerning the life of
Irving.

(b) Write in outline form a classification of Irving’s
works.

“5. Give a detailed account of the life of your favorite
American poet.

“6. (a) Name three striking characteristics of the poetry
of each of the six great American poets.

(b) Name two of the best poems of each.

“7. (a) Give five facts concerning the life of Poe.

(b) Give ten striking characteristics of his work.

“8. (a) Discuss the prose of Emerson, Lowell, and Holmes.

(b) Quote two epigrams from Emerson’s essays.





“9. Write a paragraph on the subject: ‘Thoreau’s Individualism.’

“10. For what were the following noted: Walt Whitman,
John Motley, Joel Barlow, Timothy Dwight, Francis Parkman,
Thomas Bailey Aldrich, William Dean Howells, Bret
Harte, Bayard Taylor, Sidney Lanier?

“11. Name the authors of the following: Tales of a Wayside
Inn, ‘Commemoration Ode,’ The Prairie, The Prince of
Parthia, ‘Laus Deo,’ My Study Windows, ‘The Last Leaf,’
‘Tampa Robins,’ The Blithedale Romance, ‘Early Spring.’


“CIVICS (Eighth grade) 

“1. How does the Child Labor Law govern the employ- 
ment of children in Illinois? 

“2. What provisions are found in the United States Constitution
in regard to the right to vote?

“3. What are the voting qualifications in Illinois? 

“4. Explain how the President of the United States is 
elected. 

“5. How does a postman secure his position? What are 
some of the necessary qualifications? 

“6. Write the Preamble to the Constitution of the United 
States. 

“7. In what particulars were the Articles of Confederation 
faulty? 

“8. When did the Constitution of the United States go into 
operation?

“9. What is the purpose of a writ of Habeas Corpus? 

“10. How may a bill be passed over the President’s veto? 

“11. State the duties of the County Sheriff; the County 
Superintendent of Schools. 

“12. How many directors are there in school districts of 
less than 1000 inhabitants? 

“13. What constitutes the Illinois Teachers’ Examining
Board? What are its duties?

“14. State briefly the duties of the County Clerk. 

“15. What is minority or proportional representation?
How is it used in Illinois? 





“CIVICS (Eighth grade)

“1. What is a democracy? Name two. Compare our government
to a ball team; explain an aristocracy through a ball
team.

“2. What does majority rule mean? Was it right for us to
resist Britain in 1775? Why? Is a revolution ever dangerous?

“3. Name five rights of American citizens. Name five duties
of American citizens.

“4. Where did we get our ideas of liberty? What was the
Magna Charta?

“5. Explain home rule in the United States. Who was
responsible for the good or bad government?

“6. Name the three branches of our government, and the
representative of each.

“7. Who may become president? What great law tells us
this? Who is commander-in-chief of the army and navy?

“8. When and where did the Constitution Convention
meet? Who made the Constitution a power?

“9. Write the Preamble to the Constitution.

“10. Name five of the president’s secretaries and tell who
fills the offices.

“11. Who makes treaties and issues passports? Who has
charge of the mints? What is the difference between civil and
political rights? When do our political rights begin?

“CIVICS (High school, 1st year)

“1. What is an ‘unwritten constitution’? Give examples to
show that we have one.

“2. Name our colonial possessions and tell how each is governed.

“3. Give qualifications and length of term for the President,
a Senator, and a Representative.

“4. What is meant by gerrymander, pocket veto, quorum,
pacifist, recall, neutral, arbitration?

“5. Trace a bill through the process of becoming a law.

“6. Of what does the Supreme Court consist? and what are
its duties?



“CIVICS (High school)

“1. What differences did the framers of the constitution
intend to create in the two houses?

“2. What dangers are inherent in popular government?

“3. ‘The federal government is one of limited power but
within its own field it is supreme.’ Discuss.

“4. What are the functions of the grand jury? Of the petit
jury? What is a ‘hung’ jury?

“5. Discuss the origin of political parties in the United
States. State the forces at work and the political leaders. Show
how the different political parties, when in power, affect the
commerce and general prosperity of the nation.

“6. Should we abandon our present electoral system? Give
your reasons.

“7. What is the difference between obeying a law and obeying
a person?

“8. Discuss metallic and paper money in the United States,
stating the backing of each. Explain the European money
market today on the basis of the above explanation.

“9. What is the work of the National Committee?

“10. What has the Washington Conference really accomplished?

“GEOGRAPHY (Seventh grade)

“1. (a) Describe the formation of our continent. 

(b) Name the two great mountain systems and their
smaller groups.

“2. (a) Describe the size, shape and position of North
America.

(b) What was the extent of the Great Ice Sheet?

“3. (a) Write an interesting paragraph about the Eskimos.

(b) Name the New England States and give their capitals.

“4. Describe the surface features, climate, rainfall and 
products of the Middle Atlantic States. 

“5. (a) Describe the mining of coal in the Middle Atlantic 
States. 

(b) Name the by-products of petroleum. 

“6. Name and describe the three chief industries of the 
Southern States. 





“PHYSICS (High school)

“1. What is light? Cause of eclipse of moon? Show by
drawing. Draw eclipse of sun.

“2. Describe the rainbow. Show the formation in drawing.
Show by drawing why the bow is curved.

“3. What is heat? Temperature? Why does the boiling
point vary?

“4. What is the heat of fusion? Vaporization? How do
these facts affect our life?

“5. Describe a heating plant. Draw. (Either steam, water
or air.)

“6. What is probable nature of electrification? Theory of
Leyden jar?

“7. Describe a good open circuit battery and a good closed
circuit battery.

“8. Describe what you saw in X-ray.

“9. How does a motor work? Describe the arc light. Incandescent.

“10. Explain induction coil. Give practical uses of the coil.
Give uses of transformer.

“11. Describe either phone or telegraph in full. 

“12. Write eight points either in favor of or against the 
study of physics in high school. 


“ZOOLOGY (High school) 

“1. Name at least eight branches of the animal kingdom 
and an example of each class. 

“2. Discuss the classification of animals as to method used 
and basis for. 

“3. Explain how the amphibia stand between the fishes and
the reptiles. 

“4. Give four illustrations showing how insects are adapted 
to their environment. 

“5. Why are the porifera a step higher than the protozoa? 

“6. Name an animal possessing one of the following:

1. Alternation of generation.
2. Bilateral symmetry.
3. Complete metamorphosis.
4. Budding.
5. Regeneration of lost parts.





“7. Give an example to illustrate the struggle for existence
and tell how the law of the survival of the fittest came to be
established.

“8. Why are the primates of such great importance?

“9. What animal would you prefer to watch and study?
Why?

“10. Explain how zoology helps you to realize the following
objectives: (1) Health. (2) Vocation. (3) Use of leisure time.

“11. What is your opinion as to the theory of evolution?

“ARITHMETIC (Eighth grade)

“1. How much must I pay for U. S. Liberty Bonds at
92, brokerage 1%, in order to have an annual income of $600?

“2. How many blotters 6 inches long and 3½ inches wide
can be cut without waste from a sheet of blotting paper 2 feet
long and 14 inches wide?

“3. What mathematical facts do the following numbers
represent: 231, 7.92, 7000, 5280, 360, 62½, 31½, 32, 60, 16.

“4. A baseball diamond, or infield, of regulation size for
men is 90 ft. square. How long is a straight throw from first
base to third?

“5. Extract the cube root of 5832 and 148877.

“6. Draw a figure to represent a section of land. (a) Number
correctly the sections. (b) In a smaller drawing show the
N.E.¼ of the S.E.¼ of sec. 16.

“7. Define mensuration, plane surface, rectangle, trapezoid,
parallelogram.

“8. How find the area of a parallelogram? How many
square feet in a building lot 125 feet long and 50 feet wide?

“9. How many acres of land in a road 10 miles long and 4
rods wide? What is the land in this road worth if land sells
at $300 per acre?

“10. Define circle, diameter, radius, circumference.

“11. A boy measured the distance around a tree and found
it to be 6% feet. How thick is the tree, correct to the nearest 
inch, where he made the measurement ? 

“12. What is the lateral surface of a cylinder 15 inches
high and 10 inches in diameter? What is its entire surface?

“13. A silo (cylindrical) is 12 feet in diameter and is filled
to a depth of 18 feet. How many cubic feet of silage does it
contain?





“14. Define sphere, area of sphere, volume of a sphere. 
Considering the earth a sphere whose radius is 4000 miles, 
find the area of the earth’s surface. Its volume.” 

It is not to be understood that each set of questions
collected under one heading is recommended as an ex- 
amination to be given in the ordinary length of time 
allotted to that purpose. In some cases the sets are 
evidently so intended, whereas in others the sugges- 
tion is made that only eight, or, in other instances, 
ten, questions be answered. 

6. Traditional examination exercises based upon 
quotations. In addition to the various types and vari- 
eties of traditional examination exercises previously 
illustrated, there are several of which no examples or 
even descriptions have been given. Among these are 
exercises based upon the use of quotations which are 
to be compared, contrasted, criticized, supplemented, 
or in some other way dealt with. Perhaps the field of 
subject-matter in which such exercises can be most 
often used with profit is that of literature, but also 
in all the sciences, including social science, and in some
other subjects, there is place for them. 

One may, for example, call attention to the follow- 
ing two portions of the “Vision of Sir Launfal”: 

“For this man so foul and bent of stature,
Rasped harshly against his better nature.
And seemed the one blot on the summer morn —
So he tossed him a piece of gold in scorn.

“And Sir Launfal said, ‘I behold in thee
An image of him who died on the tree.
Thou also hast had thy crown of thorns;
Thou also hast had the world’s buffets and scorns;
And to thy life were not denied
The wounds in the hands and feet and side.
Mild Mary’s Son, acknowledge me;
Behold, through him I give to thee!’ ”

These quotations may then be followed by such exercises
as:

Compare Sir Launfal’s attitude toward the leper before 
and after his search for the Grail. 

Contrast the way in which the leper appeared to Sir Laun- 
fal before and after his search for the Holy Grail. 

Another possibility is illustrated by the following 
questions which may be asked in connection with the
first of the two quotations just given. Rather than calling
for comparison or contrast of one selection or
quotation with another these call for pupil attitudes 
and reasons therefor. 

Was not Sir Launfal right in scorning the leper because 
the latter had a loathsome disease? Give reasons for your
answer.

Do you think Sir Launfal’s attitude toward the leper as 
shown in this quotation was that of a Christian? Why? 

Another example which calls for the contrast of 
quotations may be cited from the field of history. For 
example, the following two statements may be found 
in a certain text: 

“Hayne claimed that a state could legally defy the laws of 
Congress if it thought them unconstitutional.”

“Webster claimed that for a state to oppose the laws of 
Congress would be treason.” 

The teacher may well call attention to these two statements,
perhaps writing them on the board, and ask
pupils to contrast them with regard to the probable
effect upon the future of our country if they were universally
accepted and acted upon.

Another possibility in dealing with quotations is to
ask pupils to criticize them, usually both ways. That
is, pupils are to be asked to give arguments or reasons
in support of the statement made and also against
it. For example, the teacher may cite the sentence:

“Jackson claimed that to the victors belonged the spoils.” 

Another statement which may be used for similar purposes,
in this case taken from a physics textbook, is:

“The metric system should be adopted by the United 
States.” 

In the case of either of these, pupils may be asked for 
general criticisms, both favorable and unfavorable, or 
more specific questions may be asked. For example, in 
connection with the first one a teacher may use such 
questions as:

What justification had Jackson for making the above
claim?

To what extent would such a procedure increase or decrease
efficiency in the work of the government?

What good and what bad results would follow from the gen- 
eral application of this principle in political affairs? 

In connection with the statement concerning the metric 
system, suitable questions are : 

What benefits would result from the use of the metric system?

Would the difficulty of getting people to adopt it be greater 
than these benefits? 

What undesirable results would follow its adoption? 

Has it proven satisfactory or unsatisfactory in countries in
which it has been adopted and used?

Would its adoption be more or less difficult in the United
States than in other countries which have adopted it? Why?

Instead of giving quotations to be criticized both pro 
and con it is sometimes desirable to give quotations 
which the pupils are told contain false statements, 
and to call for criticism of such a sort as to prove that 
the statements are false, and ordinarily also to amend 
them so that they will be true. Such statements as 
those cited below may be used as just suggested. 

“The city of Washington owes its present population primarily
to its numerous industries.”

“There is no reason why young children should not drink 
reasonable amounts of tea and coffee.” 

7. Summary. A study of the kinds of questions 
actually used by teachers and of the types of mental 
activity carried on by pupils indicates that there are 
a considerable number of more or less distinct types 
of mental activity and implies that a sufficient variety 
of questions should be used to test these different 
types. Illustrations are given of exercises testing each 
of twenty types of mental activity. Following these 
are examples of discussion questions in literature, of 
an unusual variety of completion essay examination, 
of questions selected from those actually used by 
teachers in a number of upper-grade and high-school 
subjects, and of exercises based upon quotations. 



CHAPTER IX 


THE CONSTRUCTION AND USE OF NEW-TYPE 
EXAMINATIONS 

1. Constructing new-type examinations. Teachers
who have not had a considerable amount of training
in making new-type examinations, either through
courses carried in institutions of higher learning or
through experience and other training in actual service,
should not rush headlong into their use for regular
class-room purposes. Such teachers should study
one type at a time, beginning with the single-answer
or some other of the simplest and easiest kinds to construct,
and after some practice and experience with
this pass on to a second type, then to a third, and so
on, until by the end of a year or two they have probably
made use of all of the more common types suitable
for use in the subject-matter which they happen
to be teaching. Indeed it is probable that even teachers
who have made a careful study of the construction and
use of such tests will generally not find time enough to
construct a complete set of such tests during the first
year in which they teach particular courses or subjects.
It is much better to make only a moderate use of the
new examination for a year or two, but to be sure that
such tests of this type as are employed are satisfactory,
than to employ it more extensively but in a hasty
and careless manner.




Before taking up the discussion and illustration of 
the numerous types and varieties of the new examina- 
tion, it seems appropriate to consider a number of 
general principles of construction which may be applied
to all or at least several different varieties
thereof. Some of these are more or less repetitions of
principles and suggestions given previously in connection
with examinations in general, but, since they
apply with particular force to new-type tests, it seems 
appropriate to state them here also. 

It is highly essential that new-type tests be preceded
by explicit directions as to just what the pupils
are to do, or, in other words, as to just how they are
to record their responses. If a particular group of
pupils has taken the same type of test often enough
to become thoroughly familiar with it, it may be that
the directions for later tests of exactly the same sort
can well be decidedly shortened or even omitted without
danger of misunderstanding. On the other hand,
if the pupils are decidedly unfamiliar with the kind of
exercise to be used, the instructions should not merely
be full and clear but should also usually include examples.
For younger children these examples should
often include one or two to which the responses are
already recorded in the desired manner, and one or
two more to which the pupils are to respond under the
direction of the teacher, so as to be sure that they
know what is to be done. For older pupils, at least
those in high school, it is ordinarily unnecessary to
make use of examples of the latter type. One important
item sometimes neglected in the directions is a
definite statement as to just where on the paper the
pupils’ responses are to be placed, whether along the
left-hand margin of the paper, the right-hand margin,
or elsewhere. It is sometimes recommended that the
exact amount of time to be allowed be announced, but
there does not appear to be any important reason why
this should be done. It is, however, probably desirable
to give the pupils some idea of the amount of time
so that they will know whether they are expected to
complete the test or not. If the exact amount of time
is known and a clock is visible, there is danger that
it will prove a source of distraction. Indeed, those pupils
who carry watches are sometimes similarly distracted
by them if no clock is in sight.

In the case of a new examination which consists of 
two or more parts each containing exercises of a dif- 
ferent type, certain general directions should come at 
the very first. These should usually be decidedly brief, 
rarely more than two or three sentences in length 
and frequently only one, but should in a general way 
indicate the purpose of the examination and what is 
to be done. In many instances it is satisfactory for 
these to be given orally by the teacher. 

In all types which involve statements, such as the
true-false and yes-no, many varieties of the multiple- 
answer, the completion, the incorrect statement, and 
others, care should be exercised in the wording of the 
statements so that ambiguity is avoided. If thoughtful 
attention is not given to this matter it very frequently 
happens that some pupils discover one or more pos- 
sible interpretations of an exercise entirely different 
from that intended by the teacher, and thus make the 
determination of how their responses should be scored 
more difficult. 

Marks determined by pupils’ responses to new-type
tests should be based upon a fairly large number of
items.¹ If tests are given rather frequently and the
results combined in determining marks, it is satis- 
factory to have them so short as to contain only a 
relatively few items, perhaps twenty or twenty-five. 
However, if a mark of any importance is to be based 
on a single new-type test, the number of items should 
ordinarily be at least one hundred. Two hundred are 
still better. 

Ordinarily care should be exercised to select items 
so that they accomplish one or the other of two pur- 
poses, or, in other words, are of one or the other of two 
kinds. They should be chosen so as to include prac- 
tically all the most important points covered in the 
subject-matter to be tested, or so as to constitute a 
sampling of a much larger number of points of less 
average importance. A complete testing program 
should include many items of each sort. 

¹ The term item is frequently used as here to refer to the smallest unit or part
of a test which calls for a distinct answer or pupil response. Thus a single direct-recall
or yes-no question, a true-false statement, a word to be defined, a term to
be connected with the proper part of a figure or diagram, is an item. In some
cases an item is the same as an exercise. In others a number of items are included
in one exercise.

The items included in any test should be so arranged
that there is no regular sequence of answers. For example,
the true and false statements in a true-false
test should be arranged in random or chance order, as
also should the position of the correct answer or answers
in a multiple-answer test, of the items in each
of the lists in matching exercises, and so forth. It is
frequently desirable to employ some more or less formal
means of insuring this result. In the case of an
alternative test, for example, a coin may be tossed to
decide the matter, one side standing for a true statement
or a question which should be answered affirmatively,
and the other for the opposite. Another means
of accomplishing the same result is to have as many
slips as there are statements or questions, half marked
in one way and half in another, and draw them one at
a time to determine the order. For a multiple-response
test in which the correct answer to each exercise is
to be selected from among four, the figures 1, 2, 3,
and 4 may be written on similar pieces of paper or
cardboard, these turned face down and mixed up. For
each item one slip should be drawn and the number
upon it determine the position of the correct answer
within the group of four. After each drawing the slip
drawn should be returned and mixed up with the others
before the next drawing. In matching tests alphabetical
arrangement of a column is frequently random in
so far as the basis of matching is concerned.
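
The coin, the marked slips, and the numbered slips described above can be imitated with a random number generator. What follows is only a minimal sketch in Python, offered as one possible modern equivalent and not as a procedure given in the text; the function names and the fixed seed are assumptions made for illustration.

```python
import random


def coin_toss_key(n_items, rng):
    """Toss a coin once per statement: True stands for a true statement."""
    return [rng.random() < 0.5 for _ in range(n_items)]


def slip_draw_key(n_items, rng):
    """Mark half the slips true and half false, then draw them one at a time."""
    n_true = n_items // 2
    slips = [True] * n_true + [False] * (n_items - n_true)
    rng.shuffle(slips)  # drawing every slip without returning it is a shuffle
    return slips


def answer_positions(n_items, n_choices, rng):
    """Draw one numbered slip per item, returning it before the next draw."""
    return [rng.randint(1, n_choices) for _ in range(n_items)]


rng = random.Random(1928)  # fixed seed (an assumption) so the sketch is repeatable
true_false_key = slip_draw_key(20, rng)
positions = answer_positions(20, 4, rng)
```

The slip method fixes the number of true and false statements at half each, whereas the coin leaves that number to chance. The numbered slips are returned after each draw, so the same position may come up several times in succession.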

Although the principle of random arrangement just 
stated is generally accepted, there are a few persons 
who have argued for some other order. Some of these 
have maintained that items should be arranged in order
from the most to the least important so that the
pupils who do not finish will have the opportunity of 
responding to those items of the highest importance. 
Others have maintained that the order of difficulty be- 
ginning with the easiest is best so that pupils may 
gain confidence by starting in with items of compara- 
tively little difficulty and thus gradually work up to 
the more difficult ones. There is some merit in both 
of these contentions, but it seems scarcely sufficient 
to outweigh the desirability of random arrangement. 
Especially for younger pupils it is probably desirable 
to place three or four rather easy items first, but even 
in that case it seems scarcely worth while to take the
time and trouble necessary to attempt to arrange all
in order of difficulty. In many instances, however, arrangement
may be in order of importance or of increasing
difficulty and yet be in random order so far
as affirmative and negative responses are concerned.
Unless the time limit is short enough so that few if
any pupils will reach the end of the test, it is probably 
not worth while to make any attempt to follow the 
order of importance. 

Another suggestion which has been made concern- 
ing the arrangement of items in a test is that they be 
in the same order as they have been taken up in class 
recitation or discussion. It is probable that this ar- 
rangement makes a test slightly easier and it is pos- 
sible that it serves to give a somewhat better and more 
unified review of the course as a whole. The latter is 
the chief advantage claimed for it by its advocates.
In general expert opinion is opposed to this order, 
however, largely because there is too great likelihood 
of undesirable connections between a particular item 
and the ones immediately preceding and following it. 
On the whole it does not seem best to recommend that 
this sequence be followed, though if care is taken to 
make items independent of one another, it probably 
does not involve any undesirable consequences. 

In the case of an examination containing items or
exercises of several types, all those of each type should
be gathered together under a single heading and with
a single set of directions. In other words, the so-called
“omnibus”² type of arrangement sometimes employed
in intelligence tests is not recommended for
those in the school subjects.

² An omnibus test is one in which various kinds of tasks or exercises are mixed
together in regular or, more often, irregular order instead of being grouped so
that all of one kind are together. Such a test may begin with a true-false statement
followed by a multiple-choice exercise, then an analogy, a completion statement,
another true-false one, and so on.

The wording of the statements and the suggested 
answers, where given, should be such that the correct 
answers are not too evident and the incorrect ones not 
too absurd. Items and exercises should be so con- 
structed that even though the pupils unacquainted 
with the subject-matter are of high intelligence and 
generally well informed, they will not be able to answer
them correctly, whereas those who have mastered
the subject-matter will be able to do so regardless of 
their general intelligence and of their knowledge in 
other fields. Moreover, the exercises should not be such 
that a person who knows other points than the one to 
be tested by a particular item, even though these 
points be in the field covered, can by a process of 
elimination or otherwise easily arrive at the correct 
answer thereto. For example, if such a multiple-answer 
exercise as the following were given, “The command- 
ing general of the Confederate armies during the lat- 
ter years of the Civil War was (Grant, Lee, Sherman, 
McClellan),” a pupil who knew nothing about the va- 
rious Confederate generals might easily single out Lee 
as the correct answer because he knew that the other 
three named were all Union generals. If, however, the 
names suggested were Johnston, Lee, Jackson and 
Longstreet, the exercise would probably accomplish its 
desired purpose. 
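
The weakness of the first set of names can be checked mechanically if the teacher records, for each suggested answer, the group to which it belongs. The sketch below is illustrative only: the table SIDE and the function giveaway_options are hypothetical names introduced here, and the check catches only this one kind of give-away.

```python
# Assumed lookup, supplied by the teacher: the side on which each general served.
SIDE = {
    "Grant": "Union",
    "Sherman": "Union",
    "McClellan": "Union",
    "Lee": "Confederate",
    "Johnston": "Confederate",
    "Jackson": "Confederate",
    "Longstreet": "Confederate",
}


def giveaway_options(options, correct, group_of):
    """List the wrong options that do not share the correct answer's group."""
    target = group_of[correct]
    return [opt for opt in options if opt != correct and group_of[opt] != target]


first = giveaway_options(["Grant", "Lee", "Sherman", "McClellan"], "Lee", SIDE)
second = giveaway_options(["Johnston", "Lee", "Jackson", "Longstreet"], "Lee", SIDE)
```

An empty result, as for the second set of names, means that no wrong option can be struck out merely by knowing which side a general served on.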

A test should be of such a degree of difficulty that 
no or practically no perfect scores or zero scores will 
be made. To accomplish this some items easy enough 
for practically all members of the class should be included,
others hard enough that only the very best
pupils can answer them, and others of difficulty intermediate
between the two. Sometimes this matter can
be controlled either partly or wholly by making the 
time long enough that all pupils can respond to a fair 
number of the exercises in the test, but short enough 
that none can complete the whole test. Justifiable ex- 
ceptions to this general rule may be made in the case 
of tests covering material on which absolute mastery 
has been set as a definite goal, or, in other words, over 
what may be called minimum essentials. 
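
The check on difficulty described above can be sketched in a few lines of Python. This is only an illustration: the function name `extreme_score_counts` and the sample scores are hypothetical and do not come from the text.

```python
def extreme_score_counts(scores, max_score):
    """Return (number of zero scores, number of perfect scores)."""
    zeros = sum(1 for s in scores if s == 0)
    perfects = sum(1 for s in scores if s == max_score)
    return zeros, perfects


# Hypothetical scores of ten pupils on a fifty-item test.
scores = [12, 27, 33, 50, 41, 0, 38, 45, 29, 36]
zeros, perfects = extreme_score_counts(scores, max_score=50)
print("zero scores:", zeros, "perfect scores:", perfects)
```

If either count is more than a very small part of the class, the test was probably too hard or too easy for the purpose described, unless it covers minimum essentials.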

The principle of variety should be observed in con- 
structing and using new-type examinations. In other 
words, at least several varieties should be used in the 
course of any term or semester program of testing. In- 
deed the same procedure should be followed in the 
case of a single rather long test. It is generally better, 
for example, to give a test consisting of fifty multiple- 
choice items, fifty completion items, and fifty alterna- 
tive items, than one containing one hundred and fifty 
items of any one of the three types. This is true both 
because by so doing better rounded and more com- 
plete measures are secured, and because tests are thus 
rendered less monotonous and more interesting to pu- 
pils. 

Teachers should from time to time jot down items 
which they think suitable and when a test is to be given 
look through these items and select as many as are 
needed. The writer has found it most convenient and 
satisfactory to record all such items in the form of 
positive statements and then in constructing a test to 
change them into the form desired. 

In constructing new-type tests it will be found that 
certain varieties and forms fit particular material 
much better than do others. Indeed it not infrequently 
occurs that a teacher finds it impossible to construct
a satisfactory test of the form originally in mind over
a certain body of subject-matter, and that therefore 
some other form must be employed. Some of the types, 
such as the ordinary forms of the single-answer, al- 
ternative, multiple-answer, and completion tests, have 
a wide range of use and can be adapted to almost any 
material, whereas others have a much narrower sphere 
of usefulness. 

Whenever possible without too great expenditure of 
time, money, and energy, new-type examinations should 
be multigraphed so that each pupil may have a copy 
in his hands. Indeed in many cases this is not only 
preferable but absolutely necessary if a certain form 
is to be used at all. If multigraphing is impossible or 
impracticable, however, several of the types may be 
employed without doing so. Teachers may read recall 
exercises, true-false statements, or yes-no questions, 
and have the pupils record their responses upon ordi- 
nary blank sheets of paper. The same practice may 
also be followed, though less satisfactorily, with several
other types among which are definitions, enumeration,
certain varieties of association, abbreviations
and formulae, perhaps even multiple-response, and 
others. Also it is possible to place any of the types 
upon the blackboard, keeping the test covered until the 
time it is to be used, but because of the amount of 
labor required on the part of the teacher, as well as 
the board space necessary, this is rarely satisfactory 
and is not at all recommended as a common practice. 
It is also possible in any case to dictate the material 
complete to the pupils and have them copy it before 
attempting to respond, but the time cost of this is too 
great to render its frequent use advisable. The best 
that the teacher who does not have at hand facilities
for multigraphing can do is to make use of those types
which can be given orally while pupils record only 
their responses and occasionally to write one on the 
board or dictate it in full. 

2. Scoring new-type examinations and handling the 
results. The form of the test blank or paper and the 
directions to the pupils for recording their responses 
should be devised so as to render scoring as easy and 
rapid a process as possible. For example, true-false 
statements or yes-no questions should be answered by 
placing the proper responses in front of each, so that
the responses form a straight column near the edge 
of the paper, rather than by placing them at the ends 
of the statements or questions, in which case they are 
irregularly placed. 

In many cases scoring can be greatly facilitated by 
the use of prepared slips or other material. For ex- 
ample, if a true-false test has been given with instruc- 
tions to write “true” or “false” in front of each state- 
ment, a slip of the same length as the sheet containing 
the statements may be prepared by writing the correct 
responses upon the slip, keeping them spaced so that 
they correspond with the location of the pupils’ re- 
sponses. By laying such a slip upon a pupil’s paper, 
one can run down the column and determine his score 
very quickly. Slips such as this may be used in all 
cases in which the answers are recorded in straight 
columns and perhaps in some others. For most tests 
in which the answers are not so recorded, a sheet of 
paper or cardboard may be prepared by cutting small 
slots or openings so placed that each falls over the 
correct answer to one exercise. For varieties to which 
neither of these scoring aids is adapted, a teacher can
usually with the exercise of a little ingenuity prepare
devices which are appropriate and economical. 
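
The scoring slip described above amounts to comparing a column of responses with a column of correct answers, position by position. A minimal Python sketch of the same idea follows; the function name `score_with_key` and the sample responses are hypothetical, and ignoring capitalization and stray spaces is an assumption made here for convenience.

```python
def score_with_key(responses, key):
    """Allow one point for each response that matches the key in the same position."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(
        1
        for response, correct in zip(responses, key)
        if response.strip().lower() == correct.strip().lower()
    )


# Hypothetical key and one pupil's column of responses; the last item was omitted.
key = ["true", "false", "false", "true", "true"]
pupil = ["True", "false", "true", "true", ""]
print(score_with_key(pupil, key))  # prints 3
```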

Pupils’ scores upon new-type tests should be given 
in terms of points, ordinarily one point for each item. 
After they have been given and tabulated in this form 
those for a whole class or group may if desired be 
turned into letters or other school marks according to 
the method recommended and described in Chapter IV.

It is especially valuable for teachers inexperienced
in the use of new-type examinations, but also worth 
while for those well versed therein, to tabulate and 
preserve the results of the tests which they give. By 
so doing points not made clear by the directions can 
be discovered and then later remedied so as to im- 
prove future tests. Likewise items or exercises which 
are not clear and those which are too hard, too easy, 
or in some other way undesirable, can be discovered 
and either modified or eliminated. By gathering exer- 
cises and items and by making a critical study of 
those used as suggested above, a teacher can accumu- 
late a large number of satisfactory items and exercises 
from which future tests may be constructed with a
minimum expenditure of time and thought. Such a collection
of material should never be allowed to become
complete and static even though the teacher handles 
the same course from year to year, but should always 
receive additions and modifications. 
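
One simple way of studying tabulated results, offered here only as a sketch, is to find for each item the proportion of pupils who answered it correctly and to mark items on which nearly all or nearly none succeeded. The function names, the sample data, and the limits of 10 and 90 per cent are assumptions made for illustration, not figures from the text.

```python
def item_difficulty(results):
    """results holds one list per pupil, with 1 for a correct item and 0 otherwise.

    Returns the proportion of pupils answering each item correctly.
    """
    n_pupils = len(results)
    n_items = len(results[0])
    return [sum(row[i] for row in results) / n_pupils for i in range(n_items)]


def flag_items(proportions, low=0.10, high=0.90):
    """Return (too_hard, too_easy) as lists of 1-based item numbers.

    The limits `low` and `high` are illustrative assumptions.
    """
    too_hard = [i + 1 for i, p in enumerate(proportions) if p < low]
    too_easy = [i + 1 for i, p in enumerate(proportions) if p > high]
    return too_hard, too_easy


# Hypothetical results for four pupils on a four-item test.
results = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
]
proportions = item_difficulty(results)
print(proportions)              # [1.0, 0.75, 0.0, 0.75]
print(flag_items(proportions))  # ([3], [1])
```

Items marked in this way are candidates for modification or elimination before the collection is used again.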

There is no reason why the ordinary class-room 
teacher who makes use of new-type tests cannot make 
them practically the equal of commercially available 
standardized tests except as regards the numbers of
pupils to whom they are given, but it is scarcely worth 
the time and labor required to do so. A considerable
amount of time is necessary to gather data and interpret
them so as to show validity, reliability, and so
forth. In some cases it is worth while for a group of 
teachers or a supervisory official to do so, but for the 
regular class-room teacher more profitable use of the 
time required can ordinarily be found.

3. The selection of the most appropriate types for 
class-room use. The principle of variety in the use of 
new-type examinations has already been referred to. 
That is to say, any rather long single examination or 
any series of short ones, should include tests of sev- 
eral varieties. The selection and frequency of use of the 
different varieties of objective tests, however, ought 
not to be left to chance, but should receive careful attention.
A number of different criteria which apply
in this connection will be discussed briefly in the next 
few pages. In the succeeding chapters, which deal with 
the main types of the new examination, some attention 
will be devoted to the advantages and disadvantages 
of each type, but it seems appropriate to discuss the 
matter here from a more comparative point of view. 

As was brought out in the discussion of the relative 
merits and demerits of traditional and new-type examinations,
the question of validity is of prime importance
in connection with any test. Several of the
studies referred to in that connection and also a number
of others present data on the comparative validity
of different types of objective tests. The chief diffi- 
culty in arriving at definite conclusions lies in the 
selection of satisfactory criterion measures. The criteria
most often employed have been teachers’ marks
or estimates, or some combination of scores from sev- 
eral kinds of tests. Although such studies as those of 
Bonder (42), May (52), and Wood (99) have yielded
somewhat different results, the first indicating that
true-false tests have higher validity than several other 
types, the second favoring multiple-answer tests, and 
the third completion tests, the differences are in gen- 
eral too small to be very significant. Therefore it can 
be said that the majority of published investigations
agree fairly well in supporting the conclusion that 
there is little difference in the validity of the three 
types of the new examination just mentioned. Few of 
the studies made have included other types, but the re- 
sults of those which have done so tend to indicate 
that the single-answer, analogies, and matching types 
also possess about the same validity as the true-false, 
multiple-answer, and completion. It is very probable 
that for particular bodies of subject-matter and for 
special purposes certain forms of exercises yield more 
valid results than do others, but in general it appears 
that at least all of the more commonly used forms of 
the new examination differ so little in regard to validity
that it need not be considered as a factor in selecting
the type to be used.

Another important characteristic of a good examination
is reliability. The reported data upon this point
are more extensive and probably more satisfactory 
than those having to do with validity, but tend to much 
the same conclusion. Ruch (71, pp. 111-113, and 74,
pp. 70-71), who has made extensive studies of the 
matter, reports data which indicate that the reliabil- 
ity of single-answer and multiple-answer tests is some- 
what greater than that of true-false ones. Hammond 
(33), who studied the same three types, found little
difference between the three, the single-answer type, 
however, ranking slightly lower than the true-false, 
and both a little below the multiple-answer. Toops
(87) found that for the same number of items the
reliability of the single-answer type was highest, that
of the multiple-answer second, and of the true-false 
lowest, but that for the same amount of working time 
on each there was little difference. Hopkins (36) gives 
figures which indicate that the multiple-answer and 
completion types are slightly more reliable than the 
true-false. An unpublished study with which the writer 
is acquainted indicates that matching tests have about 
the same reliability as multiple-answer ones. On the 
whole, the evidence indicates that as between the 
single-answer, the multiple-answer, the completion and 
the matching types, at least, there is practically no 
choice in so far as reliability is concerned. The true- 
false type is probably slightly less reliable than the 
four just mentioned, though the difference is certainly 
not very great if computations are based upon the 
same amount of working time rather than upon the 
same number of exercises or items. 
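
The contrast between comparing reliabilities for the same number of items and for the same amount of working time can be illustrated with the Spearman-Brown formula, which estimates the reliability of a test lengthened k times from that of the original test. The text does not state that the studies cited used this formula, and the figures below are hypothetical. The sketch only suggests why a type that permits more items in a given time may lose little in reliability when time rather than number of items is held constant.

```python
def spearman_brown(reliability, k):
    """Estimate the reliability of a test lengthened k times (Spearman-Brown formula)."""
    return (k * reliability) / (1 + (k - 1) * reliability)


# Hypothetical figures: a fifty-item true-false test with reliability .70,
# and the assumption that pupils can answer twice as many true-false items
# in the time required for fifty items of another type.
same_items = 0.70
same_time = spearman_brown(same_items, k=2)
print(round(same_time, 2))  # prints 0.82
```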

The question of objectivity of scoring does not play 
an important part in determining the choice of types 
of the new examination because in practically all types 
it is relatively high. As will be seen from examining 
the examples of tests given in the succeeding chap- 
ters, a few varieties of certain chief types are con- 
siderably less objective than others, or at least require 
considerably more skill and forethought to make them 
as objective. It is, therefore, recommended that these 
varieties be used only when there happen to be other
reasons therefor which outweigh this disadvantage. 
In a few cases they are needed to test particular types 
of mental activity which should not be overlooked. 

Ease of scoring should be given consideration in deciding
upon the form of tests to be used. In general
those tests are easiest to score which are so arranged
that pupils’ responses form a straight column on each 
page. Since it is possible to provide for recording re- 
sponses in this form on almost all varieties of new- 
type tests, this principle affords little aid in the selec- 
tion of the variety to use. 

Occasional attempts have been made to argue in 
favor of some one or other type of the new examina- 
tion on the basis of its relative difficulty as compared 
with that of the others. Such evidence as is available 
agrees fairly well with what one would expect from a
thoughtful study of the different forms to the effect 
that the degree of difficulty tends to be in inverse ratio 
to the amount of assistance or suggestion provided. 
Pupils who would be unable to recall the correct re- 
sponses to single-answer exercises covering the same 
points can frequently select the right answer from 
among a group of suggested answers in a multiple- 
response test or connect the proper items when given 
in a matching test. It is not apparent, however, that 
the matter of whether or not a particular type is more 
or less difficult than another has any important bearing
upon the selection of the type to be used. The differences
in the difficulty of the several types are so
much less than the variations in difficulty possible 
within the same type as to be practically negligible in 
comparison therewith. 

An important consideration in the selection of tests 
by teachers who do not have ready access to facilities 
for producing duplicate copies is the matter of whether 
or not they can be given orally, by the use of the black- 
board, or in some other method which does not involve 
duplication. It has already been suggested that certain 
forms lend themselves to such use more easily than
do others. Further comments as to the availability of
the different varieties of tests will be found from time
to time in the succeeding chapters.

A general criticism has sometimes been brought
against all types of the new examination which include
a list of suggested answers to the effect that they discourage
initiative on the part of pupils and encourage
dependence and also guessing. Those who advance this
criticism have sometimes urged that the single-answer
and other tests which provide very little or no help
at all be used almost, if not entirely, exclusively. It
seems to the writer, however, that this is not a sufficiently
strong argument to justify the avoidance of
multiple-answer, matching, and other tests which do
contain a group of possible answers, although it is
an argument against their exclusive use. Pupils should
become accustomed to solving problems and answering
questions of both types, that is, those which provide 
no hints or suggestions as to the correct responses and 
also those which do provide such help. In child life 
outside the school and also in adult life both types of 
situations are encountered and it is desirable that 
training for both be given as a part of school proce- 
dure. 

Because the alternative type of test, especially when 
in true-false form, tends to be more confusing³ than
most, or perhaps all, of the other types, and still more
because it is frequently difficult to convince pupils of 
the justice of the proper method of scoring, this type 
should receive somewhat less use than the other chief 
ones. It should, however, have a place, since the ability
to choose between two responses or courses of action
is one that should be developed.

3 It is not intended to imply that the danger of confusing pupils as to what
they know is very serious if alternative tests are employed as they should be, but
only that it is somewhat greater than with other types.

4. Summary. Teachers who are just beginning to use 
new-type examinations should make a careful study of 
their construction and use and then enter upon a com- 
plete program of employing them rather slowly. Such 
tests should be preceded by explicit directions which 
in many cases will include examples of the exercises 
which follow. Both the selection and the wording of 
exercises or items should receive critical attention. The 
arrangement of exercises or items should ordinarily be 
random and sometimes also in order of difficulty be- 
ginning with the easiest. The incorrect answers in 
those types which contain suggested responses should 
not be too evidently wrong. The degree of difficulty 
of each test should be such that practically no zero 
or perfect scores will be made. In any long examination
or series of short tests a number of types of exercises
should be employed. It is almost always desirable
if possible to place a copy in the hands of each
pupil. For scoring new-type tests strips of cardboard 
and other mechanical devices should be prepared so
as to save as much time and labor as possible. Pupils’
scores as well as the exercises and items used in tests 
should be preserved and thus a collection of satis- 
factory testing material accumulated. By preserving 
scores the difficulty of this material can be known. Although
validity, reliability, objectivity of scoring, and
so forth, are important qualities in a good test, they
appear to be so nearly the same for the different types
of the new examination that they scarcely need be considered
in selecting the types to be employed. In formulating
a program of new-type tests one may therefore
select the types to be employed with little regard
to most of the criteria of a satisfactory test, rather
endeavoring to employ those best adapted to the subject-matter
and kind of mental activity desired. The
one important exception to this statement is that alternative,
particularly true-false, tests should be used
somewhat less frequently than each of the other main 
types. 



CHAPTER X 


SINGLE-ANSWER OR RECALL TESTS

1. General discussion. The simplest and probably
the most often used in the past of the types of the 
new examination is the single-answer or recall variety. 
Each exercise of this type consists of a direct ques- 
tion, or its equivalent, to which the answer is a single 
expression, usually a word. Such exercises have long 
been used by teachers in written tests and examinations,
but only recently have they been more or less
set apart as a particular variety and especially dis- 
tinguished from those to which the answers are sen- 
tences, paragraphs, or other longer units. It is also 
understood that a single-answer exercise be of such 
a nature that there can be no, or practically no, doubt 
as to the correctness of any given answer. In many 
cases the questions are such that only one correct an- 
swer is possible and, therefore, any other answer 
should be marked wrong. In other instances, however,
there may be two or more answers any one of which 
is satisfactory, but still no doubt concerning the cor- 
rectness of any given answer. For example, to such a 
question as, “Who was commander-in-chief of the 
Union forces during the last few months of the Civil 
War?” the only possible correct answer is “Grant.” 
If, however, the question, “Who was one general who 
commanded the Union Army in its attempts to capture 
Richmond?” were asked, any one of several possible
answers, such as “Grant,” “McClellan,” “Burnside,”
and so forth, would be correct and all others
incorrect. 
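
The distinction just drawn, between questions with one correct answer and those with several acceptable answers, can be represented by keeping a set of accepted answers for each question. The Python sketch below is illustrative only. The function name `score_recall` is hypothetical, and the second set lists only the three generals named in the text, which itself adds "and so forth."

```python
def score_recall(responses, accepted):
    """Allow one point for each response found among the accepted answers.

    `accepted` holds one set of acceptable answers per question.
    """
    score = 0
    for response, answers in zip(responses, accepted):
        if response.strip().lower() in {a.lower() for a in answers}:
            score += 1
    return score


accepted = [
    {"Grant"},                           # only one correct answer
    {"Grant", "McClellan", "Burnside"},  # partial list of acceptable answers
]
print(score_recall(["Grant", "Burnside"], accepted))  # prints 2
print(score_recall(["Lee", "Sherman"], accepted))     # prints 0
```

Because every acceptable answer is written down in advance, there is no doubt concerning the correctness of any given response at the time of scoring.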

This type of exercise calls for responses, almost al- 
ways items of information which pupils have had the 
opportunity to memorize, without giving any sugges- 
tions or aids as to what they are. Therefore it not 
only permits initiative, but even develops it more than 
do most of the other forms of the new examination. 
The lack of suggested aid renders it more difficult for 
pupils than the multiple-answer and other types which 
offer several possibilities or in some other way afford 
more or less aid. Also there is practically no chance of 
guessing the correct answer, a fact which further 
tends to cause fewer correct responses. Since single- 
answer exercises are more difficult than most other 
forms of the new examination, more time is required 
for pupils to complete the same number of items or 
exercises than in many of the others. Likewise the time 
for scoring is usually longer although this is not true 
if the questions are such that each has only one pos- 
sible correct answer. Because of the greater difficulty 
of formulating such questions than appears in connec- 
tion with some other types of exercises, this variety 
is not quite as highly objective as some of the others. 
With a reasonable amount of care, however, the ob- 
jectivity of scoring can be made sufficiently high to be 
reasonably or even entirely satisfactory. 

One of the great advantages of this form of exer- 
cise is that it can be adapted to almost any subject or 
portion of a subject. Although its chief function, similar
to that of all forms of the new examination, is
to measure knowledge of memorized facts, it can be 
used to do more than this in many instances. Applications
and comparisons of facts, for example, can be
tested by formulating questions which have not been
directly answered in assigned material or by the 
teacher. Single-answer exercises can frequently be 
used to replace those asking for definitions, explana- 
tions, and so forth. For example, instead of asking 
such a question as, “What is an exponent?” or, “What
is a subjective complement?” one may ask, “What
name is given to the small figure, letter, or other expression,
written at the upper right-hand side of a
quantity to show how many times it is to be used
as a factor?” or, “What name is given to a noun used
to complete the meaning of a copulative verb?” The
form used, that is, the ordinary question, is familiar
to pupils, therefore the instructions needed are re- 
duced to a minimum and illustrative exercises and re- 
sponses often unnecessary. Because of this familiarity 
there is little danger that pupils will misunderstand or 
attempt to respond otherwise than as desired. 

One very practical advantage of single-answer tests 
is that they are about the easiest and most satisfactory 
of the new-type examinations for oral use. A teacher 
can read a single question, perhaps re-read it, allow 
a reasonable amount of time for pupils to think of and 
record the answer, then continue to a second question, 
and so on. Indeed, this procedure is not new to many 
teachers and pupils. 

The ordinary, and generally most satisfactory, 
method of scoring a recall test is to allow one point 
credit for each exercise or item correctly answered. It 
is true that the difficulty and importance of the various 
exercises and items may differ considerably, but, as 
has been shown in Chapter IV, it is rarely desirable to
take such differences into consideration in determining
the number of points credit allowed each correct
answer. In the two or three varieties of recall tests 
in which each exercise or item calls for more than 
one response, it is probably best to allow one point 
for each response rather than one for each exercise 
or item. For example, if a pupil is asked to give three 
examples of each of a number of things called for, he 
should receive one point credit for each correct ex- 
ample regardless of whether or not he gives a complete 
and correct group of three. 
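
The rule of one point for each correct response, rather than one for each complete exercise, may be sketched as follows. The exercise, the list of acceptable answers, and the function name `score_multi_response` are hypothetical. Refusing credit for a repeated answer and limiting credit to the number of responses called for are assumptions not stated in the text.

```python
def score_multi_response(responses, acceptable, required):
    """Allow one point for each distinct correct response, up to `required`."""
    accepted = {a.strip().lower() for a in acceptable}
    given = {r.strip().lower() for r in responses}
    return min(len(given & accepted), required)


# Hypothetical exercise: "Name three hardwoods."
hardwoods = {"oak", "maple", "hickory", "walnut"}
print(score_multi_response(["oak", "pine", "maple"], hardwoods, required=3))  # prints 2
```

The pupil in the example receives two points although the group of three is incomplete.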

On the whole it is recommended that the single- 
answer variety of new-type tests be one of the three 
or four most often and widely used. Because of its 
familiar form those who are just beginning to work 
with the new examination can well employ it at first 
until greater familiarity with other varieties thereof 
is acquired. 

2. Ordinary single-answer tests. In its ordinary form 
the single-answer test may be illustrated by the fol- 
lowing set of questions in the field of manual training. 
Almost all of the questions are such that only one pos- 
sible answer is correct although in a few cases there 
are others which should be accepted. 

Directions: Each of the following questions can and should 
be correctly answered by a single word or number. Write the
correct answer to each immediately after the question. Be
sure that you write only one word or number after each.¹

1. What kind of a handsaw should be used to cut with the grain? __________
2. What kind of a plane should be used to smooth the end of a small board? __________
3. What sort of a bit is used to bore large holes? __________
4. What kind of a file is used in sharpening saws? __________
5. What should be used to close the pores of open-grain wood? __________
6. With what tool should a chisel be driven? __________
7. In what is shellac commonly dissolved? __________
8. What wood is generally used for the handles of axes and hatchets? __________
9. What material is ordinarily used to cement metals together? __________
10. What number on the shank of a bit indicates that it will bore a hole one-fourth of an inch in diameter? __________

Considerable time is saved in the scoring of single-
answer tests if provision is made for having pupils
record all of their answers in straight columns. The
most convenient place for these columns of answers is
at the left-hand edge of the test sheets. Blank lines of
uniform and sufficient length may be placed in front
of the numbers of the various questions or items and
pupils instructed to place their answers upon these
lines. The scorer can then write out a list of correct
answers spaced to correspond to the lines and, by laying
it alongside of the pupils’ responses, mark the
latter very rapidly. This arrangement is illustrated by
a test in bookkeeping.

1 Examples illustrating how the pupils are to record their responses will not
be included in the directions for all tests but only frequently enough to suggest
how they should be worded when they are necessary. As was suggested earlier

in the text, they are not needed when testing pupils who have become familiar
with the particular type of test being used, but in most cases should be employed
in the first two or three tests of any particular variety given a group of pupils.

Directions:² The correct answer to each of the questions below
is a single word or number. In each case in which you
know, or think you know, the answer, write it upon the blank
line in front of the question. Do not write more than one word
or number on each line.

__________  1. What is the technical name for all goods bought to be sold?
__________  2. What term is applied to all money and business papers which pass as money?
__________  3. What are values received called?
__________  4. Under what exact heading should a note signed by another person which you acquire be entered?
__________  5. What is the book which contains the accounts of a business called?
__________  6. What is an itemized statement of amount and value of goods on hand called?
__________  7. What are the debts of a business called?
__________  8. What must be deducted from the total receipts to find the net profit?
__________  9. What is the process of transferring items from the journal to the ledger called?
__________ 10. What term is applied to all the property belonging to a business?
__________ 11. What name is given to a book which contains a record of all merchandise bought?
__________ 12. What term is applied to a systematic classification of debits and credits?
__________ 13. What term is applied to an instrument which can be passed from hand to hand like money?
__________ 14. How many methods of closing a ledger are there?
__________ 15. What kind of a draft is payable on presentation?
__________ 16. What must the drawee do to make a time draft legally binding?
__________ 17. For what sort of a note are the signers responsible both individually and collectively?
__________ 18. What name is applied to cashing a note before it is due?
__________ 19. What term is applied to a check which has been stamped by the bank on which drawn to indicate its validity?
__________ 20. What term is applied to loss in value of property through its ageing or wearing out?

2 It will be noticed that the directions for different tests of exactly the same
sort are not worded alike. This is done purposely, to suggest various phrasings,
any one of which is satisfactory.

3. Single-answer tests each containing only one exer- 
cise. Under single-answer tests there are included not 
only those in which each exercise consists of a com- 
plete question but also a number of other varieties. In 
one of these a single exercise includes a number of 
items to all of which responses of the same sort are 
to be given and thus may frequently be used as a whole 
test. For example, a list of abbreviations or formulae, 
each of which is to be expanded, may be given. Such 
exercises are illustrated by a list of abbreviations from 
music and one of formulae from chemistry. Since the
lengths of the various abbreviations and formulae included
are so nearly the same, it is as satisfactory
from the standpoint of scoring to have the responses
written after as before them, and, since the former
is most common, it is probably to be preferred.


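Directions: Each of the following items is an abbreviation
of a musical term or expression. In each case in which you
think you know the meaning of the abbreviation, write this
meaning upon the line following the abbreviation.

 1. M.S. __________          11. fff. __________
 2. ff. __________           12. l.h. __________
 3. D.C. __________          13. espr. __________
 4. f. __________            14. mf. __________
 5. ppp. __________          15. poco rit. __________
 6. M.D. __________          16. M.M. __________
 7. mp. __________           17. p. __________
 8. pp. __________           18. ad lib. __________
 9. fp. __________           19. sf. __________
10. dim. __________          20. Rf. __________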



Directions: Each item in the list below is a chemical formula.
It is to be expanded, that is, the substance for which it
stands is to be given. Place the name of each substance on the
line just after its formula.

 1. NO __________            11. NaOH __________
 2. [illegible] __________   12. KNO₃ __________
 3. [illegible] __________   13. BaSO₄ __________
 4. [illegible] __________   14. FeCl[?] __________
 5. CO₂ __________           15. KClO₃ __________
 6. HNO₃ __________          16. SO[?] __________
 7. H₂SO₄ __________         17. NaCl __________
 8. CuCl __________          18. HC₂H₃O₂ __________
 9. P₂O₅ __________          19. H₂CO₃ __________
10. HCl __________           20. HgCl __________


Another slight variation is to reverse the process 
illustrated, that is, to give the complete terms and call 
for their abbreviations or formulae. This is illustrated 
by the following example, also from chemistry. 


Directions: Write the chemical symbol for each of the following
substances on the short line in front of it.

____  1. aluminium           ____  6. gold
____  2. arsenic             ____  7. helium
____  3. bromine             ____  8. iron
____  4. carbon              ____  9. lead
____  5. copper              ____ 10. mercury

Another possibility is illustrated by a similar test 
which may be used in cooking. It will be seen that
this names ten articles of food more or less commonly 
dealt with in cooking courses, and requires the pupils 
to state how long each should be cooked. 

Directions: On the blank line in front of each of the items
below state in figures the number of minutes which it should
be cooked. Be sure to state this time in minutes and not in
hours.





1. baking-powder biscuits          6. one-egg muffins
2. boiled meat, per lb.            7. devil’s food cake
3. well-done roast beef, per lb.   8. shortcake
4. soft-boiled egg                 9. potato chips
5. roast chicken, weight 5 lbs.    10. fried small fish


Exercises of a similar sort may be used in many 
other subjects than cooking. Among these is arithme- 
tic, in which it is very often desirable to give a col- 
umn or list of numbers or sometimes other expres- 
sions and ask pupils to perform the same operation 
upon each. They may be asked to add a given amount to
or subtract it from each, to multiply or divide by a given
multiplier or divisor, to find a certain per cent of each, 
to raise each to a certain power, and so forth. The 
example below illustrates such exercises by asking for 
6 per cent of each number. 

Directions: Find 6 per cent of each of the following num-
bers. Do not do any of the work on paper except writing the
answer. The answer in each case should be written on the 
short line immediately after the number. Be sure to place 
the decimal point correctly. 

1. 200. 6. 47.25 

2. 12. 7. 12.5 

3. 115. 8. 4.4 

4. 90.5 9. 1.65 

5. 950. 10. 23.2 
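A scoring key for an exercise of this kind can be worked out mechanically. What follows is only a minimal sketch in Python and is no part of the original test; the name `six_percent_key` is hypothetical, and exact decimal arithmetic is assumed so that the decimal point falls where a pupil working correctly by hand would place it.

```python
from decimal import Decimal

# The ten numbers of the exercise above, written as strings so that
# Decimal holds them exactly rather than as binary approximations.
NUMBERS = ["200", "12", "115", "90.5", "950",
           "47.25", "12.5", "4.4", "1.65", "23.2"]

RATE = Decimal("0.06")  # 6 per cent


def six_percent_key(numbers=NUMBERS):
    """Return 6 per cent of each number, in the order given."""
    return [Decimal(n) * RATE for n in numbers]


if __name__ == "__main__":
    for position, answer in enumerate(six_percent_key(), start=1):
        # normalize() drops trailing zeros; "f" avoids exponent notation.
        print(position, format(answer.normalize(), "f"))
```

The same function serves for any other rate or list if `RATE` and `NUMBERS` are changed, for example when pupils are to find a different per cent of each number.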


4. So-called “association” single-answer tests. Among 
the forms sometimes called association tests or exer- 
cises is one variety of the single-answer type. This 
consists of a list of words or terms for each of which 





an associated word or term is to be given. Ordinarily
the kind of association is stated and is the same for
the whole list. Thus a list of dates may be given and
the event which happened on each called for, a list of
Latin verbs in the infinitive for each of which a cer-
tain form is to be given, a list of nouns for each of
which the plural is required, of books or selections
whose authors are to be named, and so forth. One pos- 
sibility of this variety may be illustrated by the geog- 
raphy test given below. 

Directions : Give the approximate population of each of the 
cities named below. State populations to the nearest hundred 
thousand if possible. Write the population of each city on the 
blank line immediately preceding the name. 

1. Washington 6. St. Louis 

2. Chicago 7. New York

3. Baltimore 8. Cleveland 

4. Boston 9. Philadelphia 

5. Detroit 10. Pittsburgh 

Another use of the same type is shown by the fol- 
lowing list of Spanish words for each of which the 
opposite is to be given. Unless constructed with un- 
usual care such a test is liable not to be perfectly ob- 
jective since it will almost always be true that some 
pupils will think of some responses which it is diffi-
cult to rate either as entirely correct opposites or as
wholly incorrect ones. However, such responses will 
not usually be very numerous. 

Directions : On the blank line after each of the words in the 
list below write the Spanish word which is its direct opposite. 
Do not in any case write more than one word upon a single
blank. 





1. abajo                 11. hacia
2. afición               12. incremento
3. angosto               13. lento
4. capaz                 14. magro
5. corto                 15. oeste
6. difícil               16. pobre
7. digno                 17. recto
8. esposa                18. sabio
9. éxito                 19. sano
10. guerra               20. último

Another variety of this same type is that which is
sometimes called the genus-species test or exercise.
This is most commonly used in the various divisions
of natural science such as botany, zoology, biology, and
so forth, but is not at all limited to these. Ordinarily
it consists of a list of animals, plants, or the like, each
of which is to be classified as belonging to a certain
order, family, or other division.

Directions: On the blank line following each of the names
of plants in the list below state the family to which the plant
belongs.

1. buttercup             6. daisy
2. cabbage               7. elm
3. carrot                8. hollyhock
4. clover                9. laurel
5. corn                  10. milkweed

A variety of the single-answer test similar to those
just illustrated but different in that it is somewhat
less definite and objective is shown by the next two
examples, one from ancient history and the other from
geography. In this variety a list of names of men,
cities or countries, or, indeed, of terms or expressions
of almost any sort, is given and pupils directed to
state one important fact associated with each. The
nature of the fact is usually specified. For example, as
in the first illustration, pupils may be asked to mention 
an event in which each of the historical characters 
named in the list took a prominent part. In the second 
example it will be seen that they are instructed to name 
an important product of each of the cities, states and 
countries listed. Other possibilities are characters con-
temporary with other historical characters, events in
each of which a prominent participant is to be desig- 
nated, sentences or constructions in either English or 
foreign language for each of which the rule illustrated 
is to be stated, and so on. The scoring of many exer- 
cises of this type is not perfectly objective. It is, for 
example, difficult to determine in some cases whether
the part played by a person in a certain event was 
prominent or not, whether a certain product deserves 
to be classed as important or not, and so forth. There 
is also a possibility of interpreting somewhat differ- 
ently the significance of the type of relationship or 
connection called for. Thus if a pupil in response to 
the first item of the test immediately below named the 
expedition against Syracuse as the event in which 
Alcibiades played a prominent part, some scorers 
would probably count this correct because Alcibiades 
played a very prominent part in persuading the Athe- 
nians to undertake the expedition and was appointed 
as one of its leaders, whereas others would count it in- 
correct because he really had no connection with the 
expedition after it arrived at Syracuse. It is not prob- 
able, however, that a large proportion of the answers 
given will be of such a sort that their correctness is 
doubtful. 

Directions: Below you will find the names of ten characters
concerning whom you have studied in ancient history. After
the name of each character state an important event in which
he took a prominent part. Do not state more than one event
for each person named. State the event in as few words as
possible and do not discuss it at all.

1. Alcibiades 

2. Aristotle 

3. Brutus 

4. Cato 

5. Clovis 

6. Croesus

7. Demosthenes

8. Fabius

9. Hannibal

10. Xerxes

Directions: On the blank line following the name of each
city or state listed below write the name of one important 
product raised or manufactured there. Name only one product 
for each. 

1. Minneapolis           6. Texas
2. Kansas City           7. Illinois
3. Holyoke               8. Wisconsin
4. Gary                  9. Louisiana
5. Astoria               10. Nevada

5. Definition or description single-answer tests. An-
other form of single-answer test is the definition or
description type, also sometimes called the identi-
fication test. It consists of a series of definitions or
descriptions in which the term or thing defined or de-
scribed is not stated, but is to be supplied by the
pupils. It has sometimes been stated that there is no
way of testing ability to define or explain through new-
type tests, but this variety offers at least some assist-
ance in this respect. It is illustrated by several tests,
one in business practice or perhaps in commercial
arithmetic, another in economics, and still a third in
literature.





Directions: Each of the following statements is the defini-
tion of a term more or less commonly used in business. Write
the proper term, which is always just one word, upon the 
blank line in front of each statement. 

1. A statement sent to a buyer, usually 

on the first of each month, which in- 
forms him how much is due the person 
or firm sending it. 

2. A form filled out and signed by a per- 
son borrowing money in which he 
agrees to repay the amount borrowed 
at some future time. 

3. The act of the payee in signing his 

name on the back of a check or note. 

4. The profits received on stock owned in 

a company or corporation. 

5. Taxes levied on such property as furni- 
ture, automobiles, live stock, stocks 
and bonds as distinguished from those 
on real estate. 

6. An amount frequently taken off of the 

price when cash is paid. 

7. The amount paid an agent for selling 

something for the owner. 

8. A loan on which payment may be de- 
manded at any time without notice. 

9. The rate of interest paid on a loan 

when no agreement has been made as 
to what it should be. 

10. An insurance policy from which the

person insured will, if still alive, re- 
ceive the face value at the end of a 
certain time. 

Directions : On the blank line in front of each number you 
are to write the term which is defined by the statement im- 
mediately following the number. The term may consist of one 
word, or of two. 

1. A tax collected on estates as they pass 

to heirs. 

2. The utility of the last unit of a series. 





3. The price which produces the most
sales. 

4. A partner who furnishes capital but 
has no part in management. 

5. The product of industry used for 
further production. 

6. Division of labor among localities and 
regions. 

7. Capital accumulated to provide for 
wear and tear. 

8. The doctrine of letting each individual 
pursue his own economic advantage. 

9. Legal protection of an author’s right
in his book. 

10. Such businesses as street railways, gas 
or light plants, water plants, etc. 


Directions: Each of the short paragraphs below is a brief 
sketch of a well-known author. Write the name of the author 
described by each paragraph on the line to its left. You need
not give full names; last names will be sufficient.

1. An English poet of the nineteenth cen- 
tury, whose wife was also a noted 
writer. He is generally rated second 
among the Victorian poets. Until his 
wife died he spent most of his time in 
Italy, but later lived mostly in Eng- 
land. 

2. The so-called first great American poet. 

He was born near the end of the eight- 
eenth century. After studying law for 
a short time, he took up literary work
and eventually became editor of the 
New York Evening Post. When he died 
he was one of the most loved of Ameri- 
can poets. 

3. A British novelist and poet who lived 

in the latter part of the eighteenth and
early part of the nineteenth centuries. 





He wrote a large number of novels, 
most of which deal with Scottish life in 
the middle ages and later. Many of 
these were written under pressure be- 
cause of the large debt which he in- 
curred through a business failure. 

4. An American poet born in Maine near 

the beginning of the nineteenth cen-
tury. After studying abroad he became
a professor of foreign languages at 
Bowdoin and later at Harvard, but fi- 
nally gave up teaching to devote more 
time to writing. He is frequently called 
the children’s poet. In addition to his 
own writing, he made many transla- 
tions from foreign tongues. 

5. An American poet who was born in 

Cambridge, Massachusetts, in 1809. 
After practicing medicine for a time 
he became a professor in Dartmouth 
and later in Harvard. He is the author 
of “Old Ironsides,” “The Chambered
Nautilus” and “The Deacon’s Master- 
piece.” He wrote prose as well as po- 
etry, much of the former being pub- 
lished in the Atlantic Monthly. 

6. Single-example tests. Another form of single- 
answer test more or less similar to the last type is that 
in which the responses called for are examples which 
the pupils must give, one for each item. There are few 
if any subjects in which there is not an appropriate 
place for such tests. Thus they may call for examples 
of species of animals or of plants, of types of litera- 
ture or of writers, of various forms, rules and princi- 
ples of grammar, either English or foreign, of kinds of 
algebraic expressions, of geographical features, and 
so on. The two tests given below illustrate this type 
in Spanish grammar and in physiology. 





Directions: Give an example of each of the following forms
or constructions. Place the example for each on the blank line
immediately under each item. In some cases the example re-
quired is a single word. In others two or more words or even
a short sentence are required.

1. plural of noun ending in stressed vowel 


2. first person singular future active indicative 


3. second person plural imperfect active indicative 


4. verb with two past participles 


5. third person plural present active subjunctive 


6. verb in -er changing stem o to ue 


7. use of adjective as interjection 


8. use of infinitive for English gerund 


9. use of infinitive in place of subjunctive 


10. verb requiring a before an infinitive 


Directions : On the blank line in front of each number you 
are to write the name of an example of what is named after 
the number. This has already been done for No. 1 so as to 
make sure you understand just what you are to do. The word 
after 1 is vertebrate, so “man” is on the blank line in front
of 1, since man is a vertebrate. In the same way write the
name of some gland on the blank line in front of 2, of some
voluntary muscle in front of 3, and so on. Ready, Go!

man  1. vertebrate              7. digestive juice
     2. gland                   8. blood vessel
     3. voluntary muscle        9. germ disease
     4. flat bone               10. excretory organ
     5. ball-and-socket joint   11. nerve center
     6. alimentary organ

7. Plural or multiple-example tests. This name has 
been chosen to designate exercises which are similar 
to the last variety, the single-example test, but differ
in that two or more responses or examples are called
for in each case. It will be seen that these would be-
come ordinary single-example tests if only one re-
sponse to each item or exercise was called for. On the 
other hand, from another standpoint some of them 
would largely or entirely lose their present character 
if put in such a form as to call for only one response. 
Other classifications have been suggested as applying 
to these tests but none of them seem appropriate and, 
therefore, they have been included here under single- 
answer tests. This type may be subdivided into two 
varieties, sometimes called partial-enumeration and 
complete-enumeration tests. The former calls for two 
or more examples of the thing named, the latter for 
all possible ones. The partial-enumeration variety is
illustrated by two examples, one in sewing and the 
other in language. 

Directions: On the four blank lines immediately after and 
below each item in the following list write the names of four 
varieties or examples thereof. 

1. cotton cloth used for dresses 





2. stitches made by hand 


3. seams 


4. woolen cloth used for suits and coats 


5. solvents for removing stains 


6. gingham 


7. lace 


8. towel material 


9. textile tests 


10. material for summer nightgowns


Directions: There are at least two words which you should
know that are synonymous with each of the ten words given
below. Write the two words synonymous with each on the two
blank lines which follow it.

1. arrogant 

2. austere 

3. blackguard 

4. celerity 

5. dabble

6. despoil 

7. evanesce 

8. gloss 

9. insurrection 

10. loggerhead






The complete-enumeration variety is illustrated by 
exercises from music and physical geography. It is 
rarely the case that the same number of items is called 
for by the various exercises of such a test, but in 
general the range should not be great, from two or 
three to five or six, or perhaps even eight or ten. Oc- 
casionally longer lists may be called for. 

Directions : Follow the directions given in each of the exer- 
cises below. In each case give all the terms or names called for 
or, if you cannot give all, as many as you can. Use the blank
lines below the exercises for the answers. 

1. List all the fractional notes used. 


2. Name the eight syllables in order from low to high. 


3. Name in order the keys from no sharps to five sharps in-
clusive.


4. Name the lines of the staff in order from the bottom up. 


5. Give the eight symbols commonly used to designate qual-
ity of tone and the meaning of each.







6. Name the spaces of the staff in order from the bottom up. 


7. Name in order the keys from no flats to seven flats in- 
clusive. 


8. Name all the string instruments commonly found in a 
large symphony orchestra, such as the Cleveland, Cincinnati 
or St. Louis orchestra. 


9. Name all the wind instruments commonly found in a 
band of twenty-five or thirty members. 


10. Name all the parts commonly represented in a mixed 
chorus of sixteen voices. 


Directions: Name as many varieties, types, or kinds of each
of the following as you can. Record your answers to each exer-
cise upon the blank lines immediately beneath it, using one
line for each answer. The number of lines under each item
is the same, but this does not mean that there are that many
answers to be given to each. In no case, however, are there
more than eight.

1. Gases in the atmosphere 








2. Common kinds of clouds 


3. Climatic zones 


4. Oceans 


5. Types of harbors 


6. Chief classes of rocks 


7. Kinds of transported soils 


8. Types of springs 


9. Forms in which material is transported by streams 


10. Steps in a river’s cycle of erosion. 













8. Compound single-answer tests. The term just
given is applied to tests which call for two or more 
responses, each of a different type. Thus quotations 
may be given and the author and source of each asked 
for, historical characters and the date of birth, na- 
tionality and field of activity required, and so forth. 
The two following tests show applications of this va- 
riety in literature and foreign language. 

The first consists of quotations from various com- 
monly studied plays of Shakespeare and calls for the 
name of the play from which each is taken and the 
character who spoke it. The second lists a number of 
German verbs, for each of which pupils are to give 
three common forms. 

Directions: Each of the following quotations is taken from
one of Shakespeare’s plays which you have studied. On the
first line, that is, the one at the left under each quotation, 
write the name of the play from which it has been taken and 
on the second line, the one at the right, the name of the char- 
acter who spoke it. 

1. As you are old and reverend, you should be wise.


2. Superfluity comes sooner by white hairs, but compe- 
tency lives longer. 


3. He that is stricken blind cannot forget 
The precious treasure of his eyesight lost. 


4. When he is best, he is a little worse than a man, and 
when he is worst, he is little better than a beast. 






5. Those that are good manners at the court are as ridicu- 
lous in the country as the behaviour of the country is most 
mockable at the court. 


6. Confess yourself to heaven; 

Repent what’s past; avoid what is to come.


7. I know thou art religious, 

And hast a thing within thee called conscience, 
With twenty popish tricks and ceremonies,
Which I have seen thee careful to observe. 


8. Thus conscience does make cowards of us all : 
And thus the native hue of resolution 
Is sicklied o’er with the pale cast of thought. 


9. I dare do all that may become a man : 
Who dares do more, is none. 


10. See what a rent the envious Casca made. 


Directions: Below you see a list containing the present ac-
tive infinitives of ten verbs. After each there are three blank
lines. On the first blank line after each verb write the first 
person singular present active indicative of that verb. On the 
second line after each write the first person singular imperfect 
active indicative of the same verb. On the third line write the 
past participle of the verb. To make sure you understand, this 
has been done for the verb “haben,” numbered 0. In writing 
the first two forms do not include the personal pronoun used 
as subject. 





0. haben 

1. abgehen

2. anfangen 

3. brauchen 

4. dürfen

5. einladen

6. führen

7. laufen

8. lesen

9. reden 

10. treffen 

The only remaining illustration of single-answer
tests, which is given below, perhaps belongs elsewhere,
yet it seemed to be slightly more appropriate to in-
clude it here than under any other general type of the
new examination. It calls for several responses to
each item, the number varying from three to six or
more. Each response is either definitely right or 
wrong. It is usually best to make use of such a lan- 
guage or pronunciation test as this without allowing 
pupils the use of dictionaries. On other occasions if 
dictionaries are available and are of such a sort that 
pupils will not be able to copy their responses exactly, 
it may be well to permit their use. 

Directions: Divide each of the words in the list below into 
syllables by drawing short lines between the syllables. Also 
indicate the syllable of each word which receives the chief 
accent by means of the ordinary accent mark. For example, 
the word “eliminate” should be marked thus: e/lim'/i/nate. 

1. abominable 6. lamentable 

2. benevolence 7. miniature 

3. circumstance 8. necessary

4. diplomatic 9. precedence 

5. historian 10. unanimous 

9. Summary. The single-answer or recall type of the
new examination cannot be said to be new or unfa-
miliar to most teachers. It consists of questions or ex-
ercises calling for single-word or expression responses.
This form of test allows place for pupil initiative, can 
be adapted to practically any kind of subject-matter, 
is perhaps the easiest of the new-type tests to give 
orally and possesses certain other advantages. The
common method of scoring such tests is to allow 
one point for each possible response. The following 
varieties of the single-answer type are enumerated, 
briefly discussed and illustrated: ordinary single- 
answer tests, tests each containing only one exercise, 
association tests, definition or description tests, single- 
example tests, multiple-example tests, and compound 
single-answer tests. 
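The common method of scoring mentioned above, one point for each response, may be sketched as follows. This is only an illustration in Python under stated assumptions: the name `score_single_answer` is hypothetical, a response counts as right only when it agrees exactly with the key after surrounding spaces are dropped, and the sample responses are invented. The key itself gives the chemical symbols called for in the ten-item exercise of the preceding pages.

```python
def score_single_answer(key, responses):
    """Allow one point for each response that agrees with the key.

    `key` and `responses` are lists in the same order. A response of
    None or an empty string stands for an item the pupil left blank.
    """
    score = 0
    for expected, given in zip(key, responses):
        if given is not None and given.strip() == expected:
            score += 1
    return score


# Symbols for aluminium, arsenic, bromine, carbon, copper,
# gold, helium, iron, lead and mercury, in that order.
SYMBOL_KEY = ["Al", "As", "Br", "C", "Cu", "Au", "He", "Fe", "Pb", "Hg"]

# Invented responses of one pupil: two wrong and one omitted.
SAMPLE = ["Al", "Ar", "Br", "C", "Cu", "Go", "He", "Fe", None, "Hg"]

if __name__ == "__main__":
    print(score_single_answer(SYMBOL_KEY, SAMPLE))
```

Exact agreement is required here because capitalization matters in chemical symbols; for tests in other subjects a more lenient comparison might be preferred.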



CHAPTER XI 


MULTIPLE-ANSWER TESTS

1. General discussion. The multiple-answer test, also
known as the multiple-choice test, the best-answer test, 
the recognition test, and so forth, is one of the three 
or four most commonly used types of the new examina- 
tion. Its essential feature is that it suggests several 
answers to each exercise and requires the pupils to in- 
dicate one or more of these answers as correct. The 
number of answers suggested may vary from two up, 
four or five being most common, and seven very rarely 
exceeded. Inasmuch as multiple-answer tests with only 
two suggested responses lend themselves to certain 
forms not possible with those which contain three or 
more responses, and as the two-answer or alternative 
tests have received very wide use and are commonly 
thought of as more or less distinct from those with 
three or more possible answers, alternative tests have 
been treated separately and will be dealt with in the 
following chapter rather than in this. 

Multiple-answer tests may take a variety of forms. 
The most common of these are the direct question fol- 
lowed by several suggested answers and the incomplete 
statement with several suggested terms to complete it. 
It will be seen, however, from the following examples 
and the discussion thereof, that many other forms of
presenting such exercises are possible and appropri-
ate. There is also considerable variation in the num-
ber of answers given and in how many of these are
to be selected by the pupils. It is most common to in- 
clude only one correct answer in the group of those 
suggested and, therefore, to have the pupils mark only 
one. Sometimes two or more right answers are given 
and the pupils are to indicate all of them. A third 
possibility is to have pupils select one response from 
several which vary in merit and in scoring to allow 
various numbers of points according to the degree of 
correctness of the one indicated. 
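The third possibility, in which the suggested answers vary in merit, may be sketched as below. The sketch is in Python and rests wholly on assumptions: the name `score_graded` and the point values shown are hypothetical, and nothing here is offered as the scoring scheme of any particular test.

```python
def score_graded(credit_tables, choices):
    """Sum the points earned when each suggested answer has its own credit.

    `credit_tables` holds one dict per exercise, mapping the number of a
    suggested answer to the points allowed for marking it. `choices`
    holds the number the pupil marked for each exercise, or None if the
    exercise was omitted. Unlisted or omitted answers earn nothing.
    """
    total = 0
    for credits, choice in zip(credit_tables, choices):
        total += credits.get(choice, 0)
    return total


# Hypothetical credits for two exercises of four suggested answers each:
# the best answer earns 2 points, a partially correct one earns 1.
CREDITS = [
    {1: 0, 2: 1, 3: 2, 4: 0},
    {1: 2, 2: 0, 3: 1, 4: 0},
]

if __name__ == "__main__":
    print(score_graded(CREDITS, [3, 3]))
```

If every table allows credit to one answer only, the same function reduces to the ordinary case in which a single answer is counted right.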

Tests of this type may be formulated for use in prac- 
tically every school subject and in almost every phase 
or division of each subject. They may be used to test 
not only knowledge of facts and amount of acquired 
information, but also knowledge of cause and effect 
relationships, ability to make comparisons, to evalu- 
ate, to apply, to illustrate, to define, and so forth. They 
are easier to prepare, and also to score, than some of 
the other types. Indeed, if, instead of underlining or 
otherwise marking the correct answers, pupils copy in 
a straight column figures or numbers which indicate 
the correct answers, this type is as easy to score as 
any variety of the new examination. Almost all kinds 
of multiple-answer tests can be constructed so that 
they possess practically perfect objectivity. 

It has been claimed that tests which suggest wrong 
answers thereby confuse the pupils and perhaps even 
tend to implant wrong ideas in their minds. It must 
be admitted that there is some danger along this line, 
but a careful consideration from the theoretical stand- 
point and such actual evidence as is available indicate 
that the danger is very slight. If such tests deal with 
material which the pupils have not studied, such con-
fusion and erroneous ideas may easily be developed.
If, however, pupils are reasonably familiar with the 
subject-matter covered, this result is unlikely to follow 
in any considerable degree and does not constitute a 
serious objection. 

Another adverse criticism sometimes made against
this type of test is that it offers a possibility of guess-
ing the correct answers when they are not known. This,
of course, is true, but not nearly so much so as in al-
ternative tests and probably no more than in other
types of the new examination. Furthermore, when
there are a sufficient number of answers suggested, the
possibility of guessing is not great enough to be a se-
rious objection. If desired, allowance can be made for
this possibility by the use of certain methods of scoring 
which will be explained later. 
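The effect of the number of suggested answers upon guessing can be shown by simple arithmetic. The following Python sketch assumes that exactly one of the suggested answers is correct and that a pupil who guesses chooses among them blindly. The function names are hypothetical, and the correction shown is the conventional "rights minus a fraction of wrongs" formula, which is not necessarily the method explained later in this book.

```python
from fractions import Fraction


def chance_of_guessing(num_answers):
    """Probability of marking the one right answer by a blind guess."""
    return Fraction(1, num_answers)


def expected_chance_score(num_exercises, num_answers):
    """Number of exercises a pupil would get right, on the average,
    by guessing blindly at every one of them."""
    return num_exercises * chance_of_guessing(num_answers)


def corrected_score(rights, wrongs, num_answers):
    """Conventional correction for guessing: rights minus wrongs
    divided by one less than the number of suggested answers.
    Omitted exercises are counted neither right nor wrong."""
    return rights - Fraction(wrongs, num_answers - 1)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(n, chance_of_guessing(n), expected_chance_score(40, n))
```

With two suggested answers, as in alternative tests, half of the guesses succeed on the average, whereas with five only one in five does.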

It is probably true that multiple-answer tests do not 
develop the initiative of pupils as much as the single- 
answer and other types which give less help in decid- 
ing upon the answers. This, however, should not be 
considered an argument against the use of this va- 
riety of test but merely against its overuse. It very 
frequently occurs outside of school, both in child and 
adult life, that one is presented with a definitely re- 
stricted group of possibilities and must select from 
among these. On this ground it appears desirable to 
employ some tests in the school-room which present 
the same sort of situation. 

One practical disadvantage of multiple-answer tests 
in certain situations is that it is relatively difficult and 
unsatisfactory to administer most varieties of them 
orally. If a question or incomplete statement followed 
by four or five possible answers is read, it is usually
difficult for pupils to keep all the answers of the group 




in mind long enough and clearly enough to reach sat-
isfactory decisions as to which are correct. Probably
the most satisfactory method of giving such tests when 
it is not practicable to use multigraphed copies is to 
write the groups of answers upon the board previous 
to the time of testing and when actually giving the 
test to read the question, incomplete statement, or 
other similar portion of each exercise and indicate 
from which group of answers on the board pupils are 
to make their selections. 

On the whole multiple-answer tests tend to be some- 
what easier than single-answer, completion, and some 
other types. This is due to the fact that frequently pu- 
pils who cannot think outright of a desired response 
will recognize it when they have it before them. The 
fact that this type is easier than several other varieties 
should not in itself be considered either an advantage 
or disadvantage but merely a fact to be recognized, 
especially in comparing scores upon one kind of test 
with those upon another. By selecting answers which 
require different degrees of discriminative ability, one
can regulate the degree of difficulty to a considerable 
extent. For example, if a teacher wishes to construct 
a multiple-answer exercise to ascertain knowledge of 
the date at which the Normans invaded and conquered 
England, she may use such an exercise as the follow- 
ing: “When did the Normans invade and conquer Eng- 
land? Ninth century, tenth century, eleventh century, 
twelfth century.” All that this exercise requires is 
that the pupils know the century in which the stated 
event happened. She may make it somewhat more 
difficult by substituting the following four possible 
responses for those used: “1015, 1044, 1066, 1085.” Pu-
pils who knew that the event occurred within the elev-
enth century might not have exact enough knowledge
to choose the proper one of the four dates given. On 
the other hand, pupils who knew that the event oc- 
curred some time during the third quarter of the cen- 
tury, even though they did not know the exact year, 
would be able to select the correct answer. A still more 
difficult form would be presented if the four suggested 
answers were “1064, 1065, 1066, 1067,” since this 
would require pupils to know the exact year in order 
to receive credit for correct answers. 
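The way in which closer spacing of the suggested dates raises the difficulty may be made concrete. In the Python sketch below, which is offered only as an illustration, a pupil's knowledge is represented by the earliest and latest years he believes possible, and the function returns the suggested answers he cannot rule out. The name `remaining_choices` is hypothetical, and the third quarter of the eleventh century is taken, by assumption, as the years 1051 to 1075.

```python
def remaining_choices(suggested_years, earliest, latest):
    """Return the suggested years which a pupil cannot rule out when he
    knows only that the event fell between `earliest` and `latest`."""
    return [year for year in suggested_years if earliest <= year <= latest]


# The two sets of four suggested answers discussed above.
WIDELY_SPACED = [1015, 1044, 1066, 1085]
CLOSELY_SPACED = [1064, 1065, 1066, 1067]

if __name__ == "__main__":
    # A pupil who knows only the quarter-century of the event.
    print(remaining_choices(WIDELY_SPACED, 1051, 1075))
    print(remaining_choices(CLOSELY_SPACED, 1051, 1075))
```

For this pupil only 1066 survives among the widely spaced dates, but all four of the closely spaced ones remain, so that he must either know the exact year or guess.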

The amount of time required by pupils to respond 
to multiple-answer exercises naturally varies with the 
form and difficulty of the exercises and with the ma- 
turity of the pupils. Different persons who have 
studied the question have suggested different rates. 
A consideration of these and of the writer’s experience 
in using such tests leads him to recommend that on 
the average elementary-school pupils be expected to 
respond to three or four such exercises per minute, 
and high-school pupils to four or five. These figures 
are based on the assumption that the exercises are 
so worded that difficulty of reading is not a hindering 
factor, and, furthermore, that they are not intended 
to be primarily speed tests. Some of the simplest and 
easiest forms can be given more rapidly than this, 
whereas others, such as those involving definitions 
and descriptions, will require considerably more time. 
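
The rates just recommended can be turned into a rough time allowance by simple division. The sketch below is offered only as an aid to arithmetic: the names `RATES` and `estimated_minutes` are assumptions introduced here for illustration, and the figures are merely the average rates suggested above for exercises of ordinary difficulty, not fixed standards.

```python
# Average rates (exercises per minute) as recommended in the text above.
# RATES and estimated_minutes are illustrative names, not part of any library.
RATES = {
    "elementary": (3, 4),   # three or four exercises per minute
    "high_school": (4, 5),  # four or five exercises per minute
}


def estimated_minutes(num_exercises, level):
    """Return (shortest, longest) likely working time in minutes."""
    slow, fast = RATES[level]
    return (num_exercises / fast, num_exercises / slow)


low, high = estimated_minutes(20, "high_school")
print(f"Allow roughly {low:.0f} to {high:.0f} minutes")
```

Exercises involving definitions or descriptions would call for a more generous allowance than such a computation yields.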

In all cases in which only one answer is to be indicated,
and also in some others, the instructions to pupils
should emphasize the point that the best answer or
answers are to be selected. It may be due either to 
definite intention on the part of the teacher or to accident
that exercises contain one or more entirely correct
answers and others partially correct, and in such
cases pupils should be directed to select and indicate
the absolutely correct ones. 

There is opportunity for the use of considerable 
judgment and skill by teachers in the selection of the 
incorrect answers used in multiple-choice tests. As indicated
above, their selection will depend to some extent
upon how difficult it is desired to make a test.
Incorrect answers should, however, never be obviously 
incorrect to a pupil who knows little or nothing of the 
matter dealt with nor apply to such an entirely dif- 
ferent topic that a pupil may recognize that they are 
wrong from his knowledge of other facts than those 
being dealt with. For example, it would ordinarily be 
undesirable to list only one Spanish city in an exer- 
cise which asked pupils to indicate the name of the 
largest city in Spain. Many pupils might know that 
the other cities were not in Spain even though they
knew nothing about the cities of that country, and thus 
be led to mark the correct one. Instead such an exercise 
should usually contain the names of several cities in 
Spain. In general it is desirable if possible to have such 
incorrect answers that, if a pupil does not know the 
right one, he will tend to select one of the wrong ones 
if he makes any response at all. This principle can 
easily be carried too far, especially for young pupils, 
so as to confuse them about what they know fairly 
well, but its reasonable application is desirable. For 
example, if several English words were given as pos- 
sible meanings of a Latin word, it would be well to in- 
clude among them some English word which resembled 
the Latin word in spelling, though it was not the cor- 
rect translation thereof. Pupils who do not know the 
correct translation would be likely to select the word 
which resembles the Latin word in spelling. 





There has been considerable discussion and ar- 
gument concerning the method of scoring multiple- 
response tests. Most of this has centered about such 
tests as provide only two possibilities, that is, alter- 
native tests, and will be considered in the next chap- 
ter. There has, however, been a considerable amount 
about those which provide more than two possible re- 
sponses. Many persons have urged that the scores sim- 
ply be the numbers of exercises or items correctly an- 
swered. The chief argument advanced in support of 
this is usually that the correlation of scores so com- 
puted with those obtained from the use of any of the 
recommended formulae is so high that it is not worth 
the considerable amount of extra time and trouble 
required to use any one of the proposed formulae which
allow for the effect of guessing. The formula most
commonly advocated in case pupils are to select just
one answer out of each group is usually given in the
form: $\text{Score} = R - \frac{W}{N-1}$. In words, this means that
the score equals the number of right responses minus 
the number of wrong responses divided by one less 
than the number of suggested answers. It will easily 
be seen that if there are four suggested answers, for 
example, a pupil’s score is computed by subtracting 
one-third of his wrong responses from his right re- 
sponses, that if there are five suggested answers the 
score is found by subtracting one-fourth of his wrong 
answers from his right ones, and so on. 
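
The arithmetic of this formula may be sketched in a few lines of Python. The function name `corrected_score` is an assumption made here for illustration, as is the treatment of omitted exercises, which are counted neither as right nor as wrong. Exact fractions are used so that thirds and fourths are not rounded.

```python
from fractions import Fraction


def corrected_score(right, wrong, num_choices):
    """Score = R - W / (N - 1), kept as an exact fraction.

    Omitted exercises are assumed to count neither as right nor as wrong.
    """
    if num_choices < 2:
        raise ValueError("an exercise needs at least two suggested answers")
    return right - Fraction(wrong, num_choices - 1)


# Four suggested answers: one-third of the 6 wrong responses is subtracted.
print(corrected_score(14, 6, 4))   # 12

# Five suggested answers: one-fourth of the 6 wrong responses is subtracted.
print(corrected_score(14, 6, 5))   # 25/2

# Blind guessing on 20 four-answer exercises yields on the average
# 5 right and 15 wrong, which the formula reduces to zero.
print(corrected_score(5, 15, 4))   # 0
```

The last case shows the reasoning behind the divisor: among N suggested answers a pure guess is wrong N - 1 times for each time it is right, so the deduction on the average cancels the credit gained by guessing.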

Among those who have studied the methods of scoring
multiple-answer tests, Ruch has probably given
most attention to the question. Brief references to two
of his articles dealing with this point will indicate his
findings and conclusions. In one he and DeGraff (72)
submit evidence part of which points to the following
conclusions: The resulting scores when pupils are in- 
structed not to guess upon the items they do not know 
are both more reliable and more valid than those re- 
sulting when they are directed to guess. The use of 
the formula suggested above instead of the simple 
number right does not appear to increase the reliability
of the scores but does increase their validity somewhat.
In the second article, by Foster and Ruch (25),
more or less similar conclusions are reached. It is 
shown that validity is slightly greater if wrongs and 
omissions as well as rights are considered. It appears, 
however, that the use of the formula given above over- 
penalizes wrong responses to a slight degree. Miller 
(55) in an excellent article considers the matter care- 
fully and recommends a scoring formula supported by 
data he has gathered. The formula which he proposes 
is somewhat complicated, at least considerably more 
so than the relatively simple one already given and is, 
therefore, scarcely practicable for regular class-room 
use. 

Considering all the evidence available from the 
studies just mentioned and others similar to them, it 
appears that there is probably some increase in the 
validity of scores which may be derived by the use of 
the best known formula over scores derived from the
$R - \frac{W}{N-1}$ formula, and likewise some increase in
the validity of scores from the latter formula over 
those which are determined merely by counting the 
number of correct responses. The differences, however, 
are generally comparatively small. Furthermore it ap- 
pears that it has not been proven that there is any 
consistently greater reliability of scores computed by
one method than of those found by another. It is,
therefore, recommended that for ordinary class-room 
tests the scores upon multiple-answer tests containing 
four or more suggested answers for each exercise or 
item be determined merely by counting the number 
of correct responses. For two- and three-response tests
and perhaps, though probably not, for four-response
tests, the formula $\text{Score} = R - \frac{W}{N-1}$ should be used.
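
The recommendation just made amounts to a simple rule, which may be sketched as follows. The function name `classroom_score` is hypothetical, and the sketch assumes the more probable reading of the recommendation, namely that four-response tests are scored by merely counting the correct responses.

```python
def classroom_score(right, wrong, num_choices):
    """Score an ordinary class-room multiple-answer test.

    Four or more suggested answers: count the correct responses only.
    Two or three suggested answers: apply R - W / (N - 1).
    """
    if num_choices < 2:
        raise ValueError("an exercise needs at least two suggested answers")
    if num_choices >= 4:
        return right
    return right - wrong / (num_choices - 1)


print(classroom_score(30, 10, 2))  # two-response: 30 - 10/1
print(classroom_score(30, 10, 3))  # three-response: 30 - 10/2
print(classroom_score(30, 10, 5))  # five-response: rights only
```

A teacher who prefers to correct four-response tests also would need only to change the comparison from 4 to 5.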

For a number of reasons which have already been 
referred to in this chapter, multiple-answer tests are 
one of the three or four types of the new examina- 
tion which merit the widest use both because of their 
adaptability to various bodies and phases of subject- 
matter, and because of their other advantages. In- 
deed, the statement is probably justified that on the 
whole the multiple-answer test is the best of the new 
types. It is, therefore, recommended that it be used 
at least as often as any other and probably somewhat 
oftener, though, because of the desirability of variety 
in the form of tests used, it should not be employed 
to the exclusion of the other types. 

2. Ordinary multiple-answer tests. The ordinary form
of a multiple-answer exercise calls upon pupils to 
select one correct answer out of a group of suggested 
answers to a direct question or of suggested expres- 
sions to complete a statement. Such exercises appear, 
however, in many other forms and measure knowledge 
and relationship of various sorts. The first example 
given below, which is in the field of agriculture, repre- 
sents what is undoubtedly the most frequently used 
form of multiple-answer test. Each exercise therein 
consists of a direct question followed by several suggested
answers of which the correct one is to be underlined.
In the example given there are five possible
answers listed. 

Directions: You will find below a list of twenty questions 
on agriculture. Each question is followed by five words, num- 
bers, or other possible answers. One of the five answers after 
each is right and the other four are wrong. Draw a line under 
the correct answer to each question. For example : 

Which of the following crops grows best in a cool climate ? 
corn, rice, tobacco, wheat, sugar cane

The correct answer is wheat, so it is underlined. Go ahead ! 

1. Of what fowl are goslings the young? turkeys, geese, 
ducks, guineas, pigeons 

2. Which is a variety of muskmelon? Ponderosa, Golden
Bantam, Ruby King, Big Boston, Rocky Ford 

3. Which is a breed of turkeys? Bronze, Orpington, Mi- 
norca, Cayuga, Wyandotte 

4. What is the minimum weight for a “good” carcass of
beef? 750, 1000, 1250, 1500, 1750 

5. Below how many months is a sheep called a lamb ? 6, 8, 
10, 12, 14 

6. From what can good hay be made ? alfalfa, barley, wheat, 
rye, oats 

7. What cloth is made from flax? buck, denim, silk, linen, 
serge 

8. What is a record of the ancestors of an animal called? 
register, registration, pedigree, certificate, breed-book 

9. In how many days do chicken eggs usually hatch? 21, 
25, 28, 31, 35 

10. How many crops are used in ordinary rotation? 2, 3,
4, 5, 6

11. In what kind of weather does spinach grow best ? very 
hot, hot, warm, cool, cold 

12. When should the pit of a hotbed be dug? January, 
March, May, August, November 

13. Of what is the Neapolitan a variety? tomato, turnip, 
beet, pepper, cucumber 

14. Of what is the Earliana a variety? corn, tomato, bean, 
pea, radish 

15. How many hens are in an exhibition pen ? 2, 3, 4, 5, 6 




16. How many inches of perch per fowl should be allowed 
for leghorns? 7, 10, 13, 16, 19 

17. At how many months of age should a hog usually be 
marketed? 3, 6, 12, 18, 24 

18. What country is especially noted for producing fast 
horses? Arabia, Belgium, Germany, Italy, Turkey 

19. How many bushels is the average yield of oats per acre 
for the United States ? 20, 25, 30, 35, 40 

20. To what length in feet do alfalfa roots sometimes ex- 
tend? 4, 8, 12, 16, 20 

As was suggested in connection with single-answer 
tests, considerable time is saved in scoring if the correct
answers are written in a straight column rather
than underlined wherever they happen to occur upon 
the page. Slightly more time is required on the part 
of the pupils when taking the test to do this and per- 
haps there is more danger of their copying an answer 
which they do not intend than of their underlining the 
wrong one. It seems probable, however, that the time 
saved in scoring is an advantage which more than com- 
pensates for the time required of the pupils and the 
possible danger of an occasional mistake. This form 
is shown by the following test. 

Directions: Each of the questions below is followed by five
possible answers of which one and only one is right. For each 
question to which you know or think you know which one of 
the five suggested answers is correct, write it upon the line 
in front of the question. 

____  1. Which applies to the principal character in
         Treasure Island? merchant, boy, lawyer, army officer,
         Frenchman.

____  2. Who wrote The House of Seven Gables? Irving,
         Thackeray, Poe, Hawthorne, Stevenson.

____  3. Who was the author of The Passing of Arthur?
         Tennyson, Browning, Goldsmith, Kipling, Macaulay.

____  4. In which selection is Gabriel an important
         character? Hiawatha, Treasure Island, Evangeline,
         Lorna Doone, The Scarlet Letter.

____  5. In which country does most of the action of
         Ivanhoe occur? Scotland, Ireland, France, Wales,
         England.

____  6. Who was the author of the “Sir Roger de Coverley”
         papers? Steele, Pope, Addison, Bacon, Johnson.

____  7. In which story was Roderick Dhu an important
         character? Ivanhoe, The Lady of the Lake, The
         Prisoner of Chillon, The Vision of Sir Launfal,
         The Talisman.

____  8. What is the name of the merchant in The Merchant
         of Venice? Orlando, Bassanio, Prospero, Antonio,
         Oliver.

____  9. Who was the author of “Il Penseroso”? Bunyan,
         Bacon, Addison, Taylor, Milton.

____ 10. What is the theme of Tennyson’s “Crossing the
         Bar”? shipwreck, death, love, happiness, ambition.
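
The saving of time in scoring comes from the fact that a column of written answers can be compared directly with a key. A minimal sketch of such a comparison follows. The function name `score_answer_column` is an assumption made for illustration, the short key covers only the first three questions of the test above and is supplied here rather than taken from the test itself, and differences of capitalization or spacing are assumed not to matter.

```python
def score_answer_column(responses, key):
    """Tally right, wrong, and omitted answers written in a column."""
    if len(responses) != len(key):
        raise ValueError("responses and key must be the same length")

    def norm(text):
        # Ignore capitalization and extra spaces when comparing answers.
        return " ".join(text.lower().split())

    tally = {"right": 0, "wrong": 0, "omitted": 0}
    for given, correct in zip(responses, key):
        if not given.strip():
            tally["omitted"] += 1      # blank line left by the pupil
        elif norm(given) == norm(correct):
            tally["right"] += 1
        else:
            tally["wrong"] += 1
    return tally


key = ["boy", "Hawthorne", "Tennyson"]
responses = ["Boy", "", "Kipling"]
print(score_answer_column(responses, key))
```

Keeping the wrong responses and the omissions separate costs nothing extra and leaves the teacher free to apply a correction for guessing if she wishes.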

Instead of arranging the suggested answers as in 
the two previous examples, it is sometimes convenient 
to use a different form. One such possibility is illus- 
trated by the following test in geography. It will be 
seen that in this the four suggested answers are placed 
to the right of each question and in column one be- 
neath another. This test and the one following it also 
illustrate a variation in the mode of pupil response. 
Each of the suggested answers is lettered or numbered 
and, instead of copying the whole answer in each case, 
the pupils merely place the proper letter or number on 
the blank line in front of each exercise. It requires
less time for pupils to do this than to write out the
complete answer. There is, however, probably a 
slightly greater chance of error in copying. On the 
whole there appears to be little difference in the de- 
sirability of having pupils copy the actual answer or 
write a letter or number designating it. A teacher 
should, however, ordinarily, perhaps even always, em- 
ploy either one form or the other rather than mix 
the two, since pupils will thus become accustomed to re- 
sponding in just the same way and not incur the dan- 
ger of confusion because of change. The same practice 
should also be followed with regard to having the pu- 
pils either underline the correct answers or copy them 
in a column, that is, they should always do one or the 
other, rather than sometimes one and sometimes the 
other. 

Directions: Each of the questions given below is to be answered
by selecting one of the four answers suggested at the
right-hand side of the sheet. In each case the letter in front of
the correct answer should be written upon the short blank 
line in front of the number of the question. For example, the 
answer to the question numbered 0 is Texas. Therefore, since 
Texas follows the letter b, “b” should be written upon the 
short line in front of the question. 

 b     0. Which state has the largest      a. California
          area?                            b. Texas
                                           c. Wyoming
                                           d. Montana

____   1. What effect do high mountains    a. heavy rainfall
          produce on the climate of the    b. light rainfall
          territory near to them but on    c. no marked change
          the opposite side from that      d. lower temperature
          whence the prevailing winds
          blow?

____   2. About how many degrees from      a. 10
          the equator are the “Horse       b. 22½
          Latitudes” located?              c. 33
                                           d. 45

____   3. What is the general direction    a. southwest
          of the Northern Trade Winds?     b. southeast
                                           c. northwest
                                           d. northeast

____   4. Which state has the densest      a. California
          population?                      b. Indiana
                                           c. Pennsylvania
                                           d. Massachusetts

____   5. Which state as a whole receives  a. Florida
          the largest rainfall?            b. North Dakota
                                           c. Arizona
                                           d. Maine

____   6. Which state contains the         a. Illinois
          largest coal fields?             b. Texas
                                           c. South Carolina
                                           d. Utah

____   7. Which state gives most           a. Georgia
          attention to raising tobacco?    b. Missouri
                                           c. Wisconsin
                                           d. Virginia

____   8. Which state has the highest      a. Ohio
          average yield of all grains per  b. Iowa
          square mile?                     c. Arkansas
                                           d. Wyoming

____   9. In which general direction has   a. southwest
          the center of population of the  b. southeast
          United States been moving?       c. west
                                           d. northwest

____  10. Of which does Pennsylvania       a. lead
          mine most?                       b. iron
                                           c. silver
                                           d. coal



A variety of the multiple-answer test, sometimes
called a similarities test and sometimes an associa- 
tion test, is illustrated by the following example in zo- 
ology. Instead of each exercise consisting of a direct 
question, two or more names of objects or terms which 
are in some way similar are given. These are followed 
by several others, some one of which is similar to the 
first two or more. This one is, of course, to be selected 
in each case. 

Directions: After each of the numbers given below you 
will find the names of two objects or two terms of some other 
sort which are alike in some way. After these are four terms 
within a parenthesis. One of these four terms is much more 
like the first two than the other three. Select this term which 
is most like the other two in each case and write its number 
on the blank line in front of the number of that line or exer- 
cise. For example, in the exercise numbered 0, the figure 4 
appears on the blank line because a lion is more like a cat and 
a tiger than is any of the other animals named. 

4  0. cat, tiger (1. dog, 2. horse, 3. bear, 4. lion)

1. valve, auricle (1. diaphragm, 2. ventricle, 3. lymph, 4. corpuscle)

2. fly, bee (1. spider, 2. flea, 3. centipede, 4. cricket) 

3. wheat, rye (1. alfalfa, 2. clover, 3. flour, 4. oats) 

4. dog, hyena (1. cat, 2. mouse, 3. goat, 4. wolf) 

5. panther, lynx (1. bear, 2. wolf, 3. leopard, 4. dog) 

6. pine, cedar (1. oak, 2. fir, 3. maple, 4. chestnut) 

7. mohair, wool (1. silk, 2. cotton, 3. linen, 4. fibre silk) 

8. carrot, salsify (1. cabbage, 2. tomato, 3. lettuce, 4. beet) 

9. whale, seal (1. shark, 2. walrus, 3. pike, 4. cod) 

10. alfalfa, clover (1. corn, 2. wheat, 3. pea, 4. radish)

It has frequently been said that it is difficult to test
spelling by the use of the new examination. It is undoubtedly
true that this subject cannot be as easily
and perhaps as satisfactorily measured by most varieties
of new-type tests as can many others. It is, however,
possible to do something with it. The following
test presents four possible spellings of each of a num- 
ber of words of which one is correct. A possible varia- 
tion would be to have each line contain several differ- 
ent words of which only one was spelled correctly or, 
probably much better, only one incorrectly. 

Directions : In each line below you will find four different 
spellings of a single word. Only one of these is correct. Look 
them over carefully and draw a line under the correctly 
spelled word in each line. For example, in the first line the 
second spelling, “accommodate,” is correct, so it has a line
under it. Now go ahead and do the others. 

 1. accomodate    accommodate   accommadate   acommodate
 2. accrued       accrewed      acrued        accrude
 3. bouket        boquet        bouquette     bouquet
 4. collaterral   colatteral    collateral    colateral
 5. discipline    dissipline    dicipline     discupline
 6. effisciency   efficiency    eficiency     efficency
 7. gauranteed    guarranteed   guaranteed    gauraunteed
 8. horrorble     horroble      horible       horrible
 9. intellegent   intelligent   inteligent    intellegint
10. kimona        kimono        kimmono       kimonno
11. leiutenant    lieutennant   lieutenant    lieutenent

It has also sometimes been said that it is difficult 
to test reading ability satisfactorily with new-type 
tests. The falsity of this assertion has been amply 
proven by the large number of standardized tests in 
this subject and the fact that these tests employ many 
different forms of objective exercises. The following 
three examples illustrate three of these forms which 
can readily be employed by the class-room teacher in 
constructing her own reading tests. The first of these 
is a word recognition test, intended, of course, for first-grade
children. Each line contains five syllables, one
of which is a word. The second of this group of tests
consists of a number of short paragraphs, each fol- 
lowed by a question and four suggested answers. Some- 
times it is desirable to have several questions after 
each paragraph. The third example likewise calls for 
the reading of paragraphs, but, instead of asking a 
question which can be answered from the content of the 
paragraph, requires the selection of the best one of 
four statements of the central thought or chief idea 
brought out therein. Tests such as these dealing with 
the understanding of continuous material are appro- 
priate for measuring both general reading ability and 
ability to read the subject-matter of particular fields. 
A very frequent cause of difficulty and even failure
is inability to read the assigned lessons comprehend- 
ingly and with fair speed. By constructing tests of 
the sort illustrated a teacher can determine whether 
or not this cause is operative and thus be in a posi- 
tion to deal with the situation more intelligently and 
effectively. Such tests may be used in literature, his- 
tory, science, and indeed in every subject which de- 
mands any considerable amount of reading. They are 
also one of the best ways of measuring ability to un- 
derstand a foreign language. When used for this pur- 
pose, a selection in Latin, French or whatever the lan- 
guage is may be followed by questions or suggested 
responses in either that language or English. 

Directions:¹ In each list on your paper there is one real
word. You are to find the real word in each list and draw a
line under it. Now look at the first list at the top of the paper.
Who sees a real word in that list? As soon as you find a real
word in it tell me what it is. . . . Yes, the real word in the
first list is “he.” Where is it? . . . Yes, it is right at the
beginning of the list. Now take your pencils and make a little
mark under “he,” like this (illustrate on blackboard by writing
“he”). Now look at the second list. Who sees a real word
in that? . . . Yes, it is “boy.” Where is it? . . . Yes, it is
right in the middle of the list. Take your pencils and make a
little mark under “boy” like this (illustrate on blackboard by
writing “boy”). Now stop looking at your papers for a moment
and look at me. In each list on your paper there is just
one real word. I want you to find this word in each list and
make a little mark under it just as you did under “he” in
the first list and “boy” in the second list. Start in with the
list just under the one which has “boy” in it. When you have
done that, look at the next list under that, then the next one
under that, and so on down to the bottom. Ready, Go! (If the
children appear to be spending too much time looking for the
word in one list when they cannot find it, tell them to skip
the lists in which they cannot find any real words.)

¹ The directions for this test and for any others intended for first-grade children
should not be on the test paper since the pupils will be unable to read them and
will merely be confused by their being there. They are, of course, to be given
orally by the teacher. The same procedure is probably ordinarily desirable during
the first portion of the second grade also, and sometimes above that. As has been
stated elsewhere, the directions actually given upon test papers should practically
always be read aloud by the teacher in the lower and intermediate elementary
grades, and frequently in the upper elementary and high-school grades.

1. he, ab, di, ne, sud. 

2. gir, lat, boy, hi, ka. 

3. me, mo, ca, bi, tu. 

4. ob, do, af, uk, ot. 

5. ake, fud, and, gen, hok. 

6. ul, et, lu, go, sa. 

7. im, nu, ka, en, at. 

8. bu, on, id, ne, po.

9. ta, hu, it, ci, ud. 

10. is, zo, wi, es, ru. 

11. seh, gla, mon, yec, she. 

12. nac, mib, can, kol, rhu. 

Directions : After the signal to start is given you are to read 
each of the paragraphs below and then the question after each 
paragraph. You will see four names or words after each ques- 
tion. One of these is the right answer to the question and the 
other three are wrong answers. You are to draw a line under
the word which is the right answer. Read the paragraphs
carefully so that when you are through you will know the 
right answer to each question, but if you do not know it, look 
up at the paragraph again and read as much of it as you have 
to to find the right answer. When you have drawn a line un- 
der the right answer to each question, go ahead and read the 
next paragraph and find the answer to the question after it, 
and so on. Ready, Go! 

1. The children who lived in the town of Smithville went to 
school at a little two-room frame schoolhouse next door to 
Miss Short’s. The four lower grades were in one room and 
the four upper in the other. The teacher in the first room 
was Miss Brown, in the second Miss Jones. 

Who was the teacher of the third grade? 

Miss Smith   Miss Brown   Miss Jones   Miss Short

2. John and Tom went hunting one day last fall. They saw 
six rabbits. John shot at four of them and Tom shot at two. 
Each of the boys missed one of the rabbits at which he shot. 

How many rabbits did the two boys hit? 
two three four five 

3. Mary and Susie were walking along the sidewalk. They 
met Jennie and Alice and asked them to go along. The first 
said she would, but the second said that she had promised 
Helen to go to her house and must do so. 

Which one of the girls would not go walking with the 
others ? 

Jennie Alice Susie Mary 

4. John went to visit his Uncle Henry and his cousins Tom 
and Paul. While he was there he helped shock oats and
plow corn. Sometimes also he would feed the cattle and 
horses. 

Which one of the persons mentioned went to visit in the
country? 

John   Paul   Henry   Tom





5. When Will was a boy he lived in a house on a steep bank 
above the water. He liked to look across and watch the peo- 
ple on the other side. Sometimes he even went across the 
long bridge to see some of them. Also he liked to watch the 
things which floated down in the current. 

By the side of what body of water did Will live when a boy ? 
lake river creek ocean 

Directions: You will find below several paragraphs with 
four sentences or statements after each. You are to decide 
which one of these four statements best tells the chief thought
of the paragraph, that is, which best tells what the paragraph
is about. When you have decided which statement does this,
draw a ring around the letter in front of that statement. Then 
go on to the next paragraph and do the same thing with it, 
and so on. 

1. Before beginning the study of sociology it is important
to define what is meant by the term society. It is commonly 
used to refer to a number of more or less permanent groups 
of people, but for scientific use a more exact definition is nec- 
essary. It implies collective or group life, but observation 
shows us that such life is not peculiar to human beings alone, 
yet we do not usually apply the word “society” to a group of
trees or other plants, nor to a collection of mineral specimens 
or postage stamps. It is necessary, therefore, to seek for other 
characteristics which will limit the meaning of the term. 

a. A society is a more or less permanent group of human
beings.

b. A society is a group of human beings carrying on a common
life by means of conscious relations.

c. For scientific study it is necessary to define the term
society exactly.

d. The popular meaning of the word society will be used
in this book.

2. Among the early Romans, as was likewise true among 
practically all peoples, there was a close connection between
family life and religion. Apparently the family in early Rome 
had as one of its chief purposes the perpetuation of ancestor
worship. This purpose did not originate the family, but from
very early times received a great deal of attention. The prac- 
tice was reinforced if not caused by the belief of the Romans 
in the existence of the soul after death and in the necessity of 
rendering the proper honors to the dead to prevent their 
souls from being unhappy and also to prevent them from re- 
turning to torture the living relatives. The Roman family 
was, of course, patriarchal and therefore the father of the 
household was its priest and performed the proper religious 
rites, usually in the presence of the whole family. 

a. The religious rites of ancient Roman families were performed
by the father of the household.

b. The Romans believed in the existence of the soul after
death.

c. The early Roman family was patriarchal in type.

d. The worship of ancestors was very closely bound up
with the family life and organization of the early Romans.

In many subjects the multiple-answer test can be em- 
ployed to test knowledge of definitions. This can be 
done either by giving one term with several possible 
definitions or one definition with several possible terms 
to which it applies. The illustration given below shows 
how the former type may be used in geometry. 

Directions: You will find below ten terms used in geometry. 
After each term are four possible definitions for it. Only one 
of these definitions is entirely correct. Draw a line under the 
letter in front of the one definition of each four which is en- 
tirely correct. 

1. Plane
   a. A limited portion of space.
   b. A surface such that if any two points in it are connected
      by a straight line that line lies wholly within the surface.
   c. A portion of surface bounded by four straight lines.
   d. The intersection of two rectangular solids, both of
      which are resting upon the same flat surface.

2. Hypotenuse
   a. The longer leg of a right triangle.
   b. The side of an isosceles triangle at the base.
   c. The longest side of a right triangle.
   d. The sum of the squares of the two legs of a right triangle.

3. Obtuse angle
   a. An angle less than 45°
   b. An angle between 45° and 90°
   c. An angle between 90° and 180°
   d. The largest angle of any triangle.

4. Supplementary angles
   a. Two angles whose sum is a right angle.
   b. Two angles opposite each other in a quadrilateral.
   c. The two smallest angles in a right triangle.
   d. Two angles whose sum is a straight angle.

5. Inscribed circle
   a. A circle within a polygon each of whose sides is tangent
      to it.
   b. A circle within another circle having the same center.
   c. A circle which contains a polygon each of whose sides
      is a chord of the circle.
   d. A circle which contains the same area as a given polygon.

6. Isosceles triangle
   a. One in which two sides are equal.
   b. One in which the three sides are equal.
   c. One containing two acute angles.
   d. One in which the base equals the altitude.

7. Complementary angles
   a. Two angles whose sum is a right angle.
   b. Two angles whose sum is a straight angle.
   c. The two base angles in an isosceles triangle.
   d. The two smallest angles in a right triangle.

8. Straight angle
   a. An angle of 360°
   b. An angle of 270°
   c. An angle of 180°
   d. An angle of 90°

9. Locus
   a. Any single point.
   b. The intersection of two lines.
   c. All points that completely fulfill given conditions.
   d. The path of any moving point.

10. Vertex
   a. The distance between the sides of an angle.
   b. The smallest angle of a triangle.
   c. The highest angle of a triangle.
   d. The point at which the sides of an angle meet.

Another possibility of the multiple-answer test which 
has not been illustrated previously is its use in con- 
nection with cause and effect or cause and result. In 
the example given below, taken from American his- 
tory, each exercise includes one effect or result and 
three causes which contributed to it, and the pupils 
are to select the one result. The items may be reversed, 
with one cause and several results, or there may be 
perhaps more than one of each. In addition to history 
and the social sciences, cause and effect exercises are 
distinctly appropriate and worth-while in the other 
sciences also, as well as sometimes elsewhere. Occa- 
sionally the name classification test is given to this 
variety. 

Directions: In each group of four events given below there
are one result and three causes which contributed to bringing 
about this result. Look over each group of four, decide which 
one is the result, and draw a line under it. 

1. Fall of Constantinople, discovery of America, invention 

of the compass, revival of learning. 

2. Napoleonic wars, imprisonment of seamen, War of 1812-14,
election of Henry Clay to the House of Representatives.



3. Panic of 1894, election of McKinley, split between Cleve-
land and Bryan, issuing of gold bonds in time of peace.

4. Mexican war, election of Polk, annexation of Texas, ex- 
pansion of the South. 

5. Defeat of St. Leger, inactivity of Clinton, surrender of 
Burgoyne, battle of Saratoga. 

6. Religious persecution in England, settlement of Plym- 
outh, dissatisfaction with living in the Netherlands, de- 
sire for political independence. 

7. Conquest of Mexico, desire for gold, voyages of Columbus,
character of Cortez. 

8. First election of Lincoln, division of Democratic party, 
Lincoln-Douglas debates, opposition to Seward. 

9. War with Spain, purchase of the Philippines, sinking of 
the Maine, Spanish misrule in Cuba. 

10. Death of Hamilton, election of Jefferson, purchase of 
Louisiana, downfall of Federalist party. 

A different variety of the so-called classification 
test is illustrated by the following example. It con- 
sists of a number of words, five in this case, of which 
the majority are alike in some way. In the example 
given four out of five are alike, leaving only one dis- 
similar word. Such a test differs from somewhat sim- 
ilar ones already given in that the basis of classifica- 
tion in each case is not stated but must be determined 
by the pupils. If more than one dissimilar or extra- 
neous word is included in each group, the test is ren- 
dered very much more difficult and it is doubtful if 
this is desirable in the elementary grades and prob- 
ably not very often even in high school. Because of the 
fact that it is common to have the extraneous word 
crossed out, the name cross-out test is sometimes ap- 
plied to this form. 

Directions: After each number below you will find a group
of five terms or names with which you should be familiar.
Four of each five are alike in some way, and the fifth one is
different from these four. Draw a line through the one word
in each group which is different from the others. For example,
if you had a group containing eggplant, carrot, salsify, oats,
lettuce, oats should be crossed out as shown because the other
four are garden vegetables and it is not.

1. Tamworth, Berkshire, Poland China, Guernsey, Hamp- 
shire. 

2. Leghorn, Brahma, Orpington, Wyandotte, Merino.

3. Arabian, Galloway, Clydesdale, Belgian, Percheron. 

4. clover, soy beans, alfalfa, timothy, cowpeas. 

5. Bordeaux mixture, paris green, arsenate of lead, kerosene
emulsion, hellebore.

6. cabbage worm, lice, potato beetle, army cutworm, 

tent caterpillar. 

7. barley, rye, oats, alfalfa, wheat. 

8. hard, flint, sweet, pop, dent. 

9. rust, loose smut, stinking smut, Hessian fly, beetle. 

10. corn, roughage, wheat, oats, bran. 

All of the examples of multiple-answer tests so far
given have, in several characteristics, been of the type 
most commonly employed. One of these is that only 
one of the suggested answers to each is correct and all 
others incorrect. Therefore, in scoring no credit should 
be given unless the one correct answer in each case 
is designated. A possible and occasionally though not 
usually desirable variation is to give answers of vary- 
ing degrees of merit. Pupils are still instructed to 
mark only one answer. If they mark the best one 
they receive a certain number of points credit, perhaps 
five ; if they mark another one not so good yet of consid- 
erable merit, they receive perhaps four or three points ; 
for another still worse two or one, and of course no 
credit at all for one absolutely incorrect. Because of 
the difficulty in determining relative weights to be al-
lowed answers of varying degrees of merit and the
increased difficulty of scoring even if these weights
have been determined, and because there seem to be
no advantages to compensate for these disadvantages,
this type of test has received comparatively little use.
A single example taken from the field of cooking is, 
however, given to illustrate it. In this example there 
are four possible answers for each exercise and in 
most cases some credit might well be allowed for at 
least three of the four. It will be noticed also that this 
test differs from most of the others already given in 
that the exercises are not direct questions, but may be 
thought of as being in completion form. This form 
is probably as frequently appropriate for multiple- 
answer tests as is the direct question, perhaps even 
more so. 
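
The difference between the two methods of scoring discussed above, full credit for the one best answer only as against graded credit for answers of varying merit, may be made concrete by a short sketch in Python. It is offered only as an illustration: the weights (5, 3, 1, 0), the two-exercise key, and the names WEIGHTS, score_weighted, and score_one_correct are assumptions introduced here and are not taken from the text.

```python
# Hypothetical key: for each exercise, the credit allowed each answer letter.
# These weights are illustrative assumptions, not values given in the text.
WEIGHTS = {
    1: {"a": 5, "b": 3, "c": 1, "d": 0},
    2: {"c": 5, "a": 3, "d": 1, "b": 0},
}


def score_weighted(responses, weights=WEIGHTS):
    """Graded credit: each marked answer earns the weight assigned to it.

    Exercises left unmarked, or marked with an unknown letter, earn nothing.
    """
    total = 0
    for exercise, credit in weights.items():
        marked = responses.get(exercise)
        total += credit.get(marked, 0)
    return total


def score_one_correct(responses, weights=WEIGHTS):
    """Ordinary scoring: one point only when the best answer is marked."""
    total = 0
    for exercise, credit in weights.items():
        best = max(credit, key=credit.get)  # letter with the highest weight
        if responses.get(exercise) == best:
            total += 1
    return total


# A pupil who marks the best answer to No. 1 and a fairly good one to No. 2.
pupil = {1: "a", 2: "a"}
weighted_total = score_weighted(pupil)
ordinary_total = score_one_correct(pupil)
```

The sketch shows where the extra labor arises: the weighted method requires a full table of credits for every exercise, whereas the ordinary method needs only the letter of the one correct answer.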

Directions: You will find below a number of statements in 
each of which is a parenthesis containing four words, num- 
bers, or phrases. Some one of these words, numbers, or phrases 
completes the statement so as to make it truer than does any 
one of the other three. Determine which of the suggested terms
makes each statement the truest and write the letter in front 
of it on the blank line in front of the statement. 

1. There is much protein in (a. soy beans, b. wheat flour,
c. corn meal, d. potatoes).

2. Starch is a prominent constituent of (a. bananas, b. apples,
c. potatoes, d. nuts).

3. Much of the vitamin directly affecting body growth
is found in (a. milk, b. eggs, c. oatmeal, d. spinach).

4. A fish containing a large amount of fat is the (a. cod,
b. halibut, c. salmon, d. pickerel).

5. An egg of average size yields about (a. 45, b. 55, c. 65,
d. 75) calories.

6. The temperature of fat suitable for deep frying is
about (a. 450, b. 350, c. 250, d. 150) degrees.

7. In making graham bread there should be one cake of
yeast for every (a. 2, b. 3, c. 4, d. 5) cups of flour.

8. A food containing a great deal of iron is (a. lettuce,
b. spinach, c. cauliflower, d. orange).



9. About (a. 3, b. 4, c. 5, d. 6) pints of ordinary home-
made meat juice are sufficient to feed an invalid for
a day.

10. The (a. chuck, b. brisket, c. round, d. flank) is the 

best cut of beef. 

The next four tests are examples of a variation of 
the ordinary multiple-answer type in which, instead of 
having a group of several possible answers suggested 
for each question or exercise, there is a single group 
for a whole list of questions or exercises. The first il- 
lustration consists of statements of a number of differ- 
ent situations or problems in arithmetic, none of which, 
however, contains any figures. The pupils are directed 
to mark each exercise with the proper sign to indi- 
cate which one of the four fundamental operations 
should be applied. Sometimes instead of using the 
signs the initial letters of the four operations are em- 
ployed. To make such a test more difficult the problems 
or statements can be made more complicated so that 
two or more operations are needed in each and the 
pupils required to indicate these. 

Directions: On the blank line in front of each of the exer-
cises given below place the proper sign to show what opera-
tions should be performed. If you should add, place a plus
sign (+) on the blank line in front of the exercise; if you
should subtract, place a minus sign (−) there; if you should
multiply, place a multiplication sign (×); and if you should
divide, place a division sign (÷). To show you just what you
are to do, a minus sign has already been placed in front of
No. 1 because you should subtract to do what exercise No. 1
calls for. Go ahead and do all the others in the same way.

−  1. Given sum of two numbers and one of them to find

the other. 

2. Given lengths of two rivers to find the difference. 

3. Given number of acres in a field and total yield to

find yield per acre. 



4. Given population of each of four cities to find total 

population. 

5. Given number of days’ work and wages per day to 

find total wages. 

6. Given principal and interest to find rate. 

7. Given improper fraction to reduce to mixed num- 
ber. 

8. Given cost of lumber and other materials and of 

labor to find cost of house. 

9. Given cross section and height of tank to find ca- 
pacity. 

10. Given rent received and cost of upkeep to find in- 
come from house. 

11. Given assessed valuation and tax rate to find the 

amount of taxes. 

The same type of test may also be used in physics. 
This is illustrated by the figure and five questions 
given below. The answer to each of these questions is 
one of the numbers given just above the questions. 

Directions: You will see on the paper below a figure which
shows a lever AB balanced at the point C with a weight at- 
tached at one end, B. The length of each portion of the lever 
and the amount of weight suspended at B are shown on the 
figure. Read each of the five questions given below, then look 
at the figure and determine what the correct answer is. In 
each case the answer is one of the numbers given immediately 
below the figure but above the questions. Copy the proper 
number on the blank line in front of each question. Use the 
same number twice if necessary. 


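[Figure: lever AB balanced on a fulcrum at C, with B at the left; BC = 6 ft., CA = 24 ft.; a weight of 10 lbs. suspended at B.]

1    2    2½    4    5    7½    10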





1. How many pounds must be suspended at A just to
balance the weight at B, if no allowance is made for
the weight of the lever AB?

2. How many feet must A be lowered to raise B one
foot?

3. How many foot-pounds of work are done in raising
B one foot?

4. How many feet to the left must C, the fulcrum, be
moved that a weight of 2 lbs. at A will just balance
the 10 lbs. at B?

5. How many feet to the right must C be moved that a
weight of 5 lbs. at A will just balance the 10 lbs.
at B?
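
All five questions above rest on the principle of moments: a lever balances when the weight at one end multiplied by its distance from the fulcrum equals the weight at the other end multiplied by its distance. The following is a minimal sketch in Python of how a teacher might check a key for such an exercise. It assumes a weightless lever, the dimensions stated in the exercise (6 ft. from B to C, 24 ft. from C to A, 10 lbs. at B), and that B lies to the left of C. The function and variable names are illustrative only, and the computed values are a check worked out here, not a key supplied by the text.

```python
from fractions import Fraction

LENGTH = Fraction(30)  # whole lever AB in feet (6 + 24)
ARM_B = Fraction(6)    # distance in feet from B to the fulcrum C
LOAD_B = Fraction(10)  # pounds suspended at B


def balancing_weight(arm_b, load_b, length=LENGTH):
    """Weight at A that balances load_b: load_b * arm_b = w * (length - arm_b)."""
    return load_b * arm_b / (length - arm_b)


def fulcrum_distance_from_b(weight_a, load_b, length=LENGTH):
    """Distance from B at which the fulcrum must stand for the two to balance.

    Solves load_b * d = weight_a * (length - d) for d.
    """
    return weight_a * length / (weight_a + load_b)


q1 = balancing_weight(ARM_B, LOAD_B)              # pounds needed at A
q2 = (LENGTH - ARM_B) / ARM_B                     # feet A drops per foot B rises
q3 = LOAD_B * 1                                   # foot-pounds to raise B one foot
q4 = ARM_B - fulcrum_distance_from_b(2, LOAD_B)   # feet C moves toward B (left)
q5 = fulcrum_distance_from_b(5, LOAD_B) - ARM_B   # feet C moves toward A (right)
```

Exact fractions are used instead of floating-point numbers so that a mixed-number answer is compared with the printed list of choices without rounding error.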

A third example of the same variety is from the 
field of medieval and modern history. It presents a 
number of events which are to be classified according 
to the one of five periods of time in which each oc- 
curred. 

Directions: Immediately below these directions you will see
the letters A, B, C, D, and E, each followed by the limits of
a certain period of time, that is, by a statement of the event
with which the period began and also that at which it ended.
Below these five statements are the names of a number of
events. You are to indicate the period of time within which
each event occurred by placing the proper letter on the blank
line in front of each. For example, event no. 1, the Abolition
of Serfdom in Russia, occurred during the period “E,” from
the French Revolution to the Present, therefore an “E” is
placed in front of it. Go ahead and do the same for the whole
list of events.

A. From the Accession of Charlemagne to the Battle of Hast-
ings.

B. From the Battle of Hastings to the Fall of Constantinople.

C. From the Fall of Constantinople to the Defeat of the
Spanish Armada.

D. From the Defeat of the Spanish Armada to the French
Revolution.

E. From the French Revolution to the Present.



1. Abolition of Serfdom in Russia

2. Battle of Bannockburn 

3. Conquest of Mexico 

4. Discovery of Greenland by the Norsemen 

5. Conquest of Saxons by Franks 

6. Execution of Charles I of England 

7. Execution of Joan of Arc 

8. Expulsion of the Moors from Spain 

9. Expulsion of the Stuarts from England 

10. First Voyage around the World 

11. Holy Alliance 

12. Last Crusade 

13. Liberation of Italy 

14. Reign of Edward the Confessor

15. Reign of Frederick Barbarossa

16. Reign of Henry VIII of England

17. Reign of Louis XIV of France

18. Separation of Norway and Sweden

19. Signing of Magna Charta

20. Thirty Years’ War 

21. Battle of Leipzig 
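
Preparing a key for a period test of this kind amounts to locating the date of each event among the four dates which separate the five periods. The following minimal sketch in Python illustrates the idea. The boundary years and the dates of the four sample events are supplied here for illustration and are not part of the original test; the names BOUNDARIES, period_of, and SAMPLE_EVENTS are likewise assumptions, and no check is made that a date falls after the beginning of period A.

```python
from bisect import bisect_right

# Years separating periods A|B, B|C, C|D, and D|E, supplied for illustration:
# Battle of Hastings, Fall of Constantinople, Defeat of the Spanish Armada,
# and the outbreak of the French Revolution.
BOUNDARIES = [1066, 1453, 1588, 1789]
LETTERS = "ABCDE"


def period_of(year):
    """Return the letter of the period within which the given year falls."""
    return LETTERS[bisect_right(BOUNDARIES, year)]


# Dates supplied for illustration; they do not appear in the test itself.
SAMPLE_EVENTS = {
    "Abolition of Serfdom in Russia": 1861,
    "Battle of Bannockburn": 1314,
    "Execution of Charles I of England": 1649,
    "Signing of Magna Charta": 1215,
}

key = {event: period_of(year) for event, year in SAMPLE_EVENTS.items()}
```

Because the periods are defined by their bounding events, an event dated in a boundary year itself is here counted with the later period; a teacher preferring the other convention would use bisect_left instead.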

3. Plural multiple-answer tests. The name plural 
multiple-answer tests, or sometimes plural-choice tests, 
has been suggested for those similar in form to the 
examples contained in the last few pages, but differ- 
ent in that pupils are not instructed to indicate only 
one of the suggested answers. Sometimes they are told 
to mark a certain number, such as two or three, or 
perhaps even more, of the answers given, but probably 
more often merely to indicate all of the answers which 
are correct, this number varying in the different ex- 
ercises of a test from one up to several. Such tests 
are generally more difficult than those in which pu- 
pils are to select only one of the suggested answers 
and require on the average more time per exercise. 
The difference in difficulty is not, however, so great
that one should feel any hesitation in using such tests
in almost any situation in which an ordinary multiple- 
choice test is not too hard. 

Plural-choice tests may resemble in form and ar- 
rangement any of the varieties of ordinary multiple- 
answer tests ordinarily given, but in view of the limi- 
tations of space it has seemed best not to give as many 
examples as there might be slight variations in form 
and arrangement. The first of those given below pre- 
sents a number of quadratic equations with five pos- 
sible values of the unknown following each. Pupils are 
instructed to indicate the one or more values which 
are correct. In this case, of course, there cannot be 
more than two correct answers. Following this is a 
test which presents four translations of each of a 
number of Latin sentences. In each case one or more of 
the translations is correct. 

Directions: You will find below ten quadratic equations 
each of which involves only one letter or unknown. Following 
each equation are five possible values of the unknown. Find 
the one or more correct values after each equation and write 
it, or them, on the blank line in front of the equation. Be sure 
not to omit minus signs. Do whatever figuring and other work 
is necessary on other paper. 

1. If x² + 4x = 12, x = (2, -2, 4, 6, -6)

2. If 2x² + 12x + 18 = 50, x = (2, -2, 4, -4, -8)

3. If 5x² - 10x = 0, x = (0, 1, 2, 3, 4)

4. If x² - 16x + 32 = 17, x = (1, -1, -5, 15, -15)

5. If 3x² + 24x - 21 = -69, x = (2, -2, 4, -4, 8)

6. If 4y² + 24y + 144 = 124, y = (1, -1, 5, -5, 15)

7. If y² + 8y - 4 = 80, y = (6, -6, 8, -10, 12)

8. If y² - 17 = 64, y = (8, -8, 9, -9, 12)

9. If 4y² + 16y - 2 = 18, y = (1, -1, 2, -2, 5)

10. If 2x² - 4x - 2 = 4, x = (1, -1, 2, -2, 3)
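
A key for a plural-choice test of this kind can be verified by substituting each suggested value into its equation and keeping those which satisfy it. The following is a minimal sketch in Python, using exercises 3, 8, and 10 as read from the list above. The encoding of each equation as coefficients (a, b, c) with a right-hand side, and the names correct_values and EXERCISES, are assumptions made for the illustration.

```python
def correct_values(a, b, c, rhs, candidates):
    """Return the candidates x that satisfy a*x**2 + b*x + c == rhs."""
    return [x for x in candidates if a * x * x + b * x + c == rhs]


# Exercise number -> (a, b, c, right-hand side, suggested values).
EXERCISES = {
    3: (5, -10, 0, 0, (0, 1, 2, 3, 4)),      # 5x^2 - 10x = 0
    8: (1, 0, -17, 64, (8, -8, 9, -9, 12)),  # y^2 - 17 = 64
    10: (2, -4, -2, 4, (1, -1, 2, -2, 3)),   # 2x^2 - 4x - 2 = 4
}

key = {number: correct_values(*spec) for number, spec in EXERCISES.items()}
```

Since a quadratic has at most two roots, a list longer than two for any exercise would show at once that an equation or a set of suggested values had been copied wrongly.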

Directions: At least one, and in many cases more than one,
of the four translations of each Latin sentence given below is
correct. Draw a circle around the letter in front of each cor-
rect translation.

1. Caesar tres homines qui terram explorarent misit. 

a. Caesar sent the three men to explore the land.

b. Caesar sent the three men who had explored the land.

c. Caesar sent three men who were to explore the land. 

d. Caesar will send three men to explore the land. 

2. Dixerunt se Roma ad Galliam citeriorem quattuor diebus
venisse.

a. They say they will go from Rome to Hither Gaul in
four days.

b. They said that they went from Hither Gaul to Rome in
four days.

c. They said that they came from Rome to Hither Gaul
within four days. 

d. They said that they had come to Rome from Hither 
Gaul in four days. 

3. Exercitus, oppido capto, muros diruit et incolas in servitutem
vendidit.

a. After taking the town the army destroyed the walls and 
sold the inhabitants into slavery. 

b. The army will take the town, destroy the walls and sell 
the people into slavery.

c. After the town had been captured the army demolished 
the walls and sold the inhabitants into servitude. 

d. When the town had been taken the army razed the walls 
and sold the inhabitants into slavery. 

4. Virtus magna hostibus erat, sed Romanos expellere non 

potuerunt. 

a. The enemy had great courage and the Romans could 
not drive them out. 

b. The enemy had great courage, but they could not drive 
out the Romans. 

c. The enemy are very brave, but they cannot expel the 
Romans. 

d. The Romans were very brave, but they could not drive 
out the enemy. 

5. Ita iter facient ut Corinthum die quinto perveniant. 

a. They are making a road to Corinth and will arrive in 
five days.



b. They will march so that they will arrive at Corinth on
the fifth day. 

c. They marched so that they arrived at Corinth on the 
fifteenth day. 

d. They will march to Corinth and arrive there in five 
days. 

The next two examples illustrate plural multiple- 
answer tests of the type sometimes called classifica- 
tion or association. Each consists of one name, term, 
or expression followed by several others, some of 
which are more or less connected with it. In the first 
example, which is in general science, pupils are in- 
structed to underline the one or more of the terms 
within the parenthesis connected with the first term. 
In the second, which is an ancient history test, they 
are to cross out all suggested answers not connected 
with the first term. Ordinarily it makes little differ- 
ence whether the connected terms or those not con- 
nected are in some way indicated. 

Directions: Below are twenty words or phrases each fol-
lowed by four others within a parenthesis. One or more of
those within each parenthesis bears rather close connection
or relationship to the term outside the parenthesis. You are
to indicate those which have this close relationship by under-
lining each one of them.

1. Necessities of life (air, clothing, water, food). 

2. Oxidation (rust, ashes, heat, water). 

3. Antiseptic (iodine, carbolic acid, hydrogen peroxide, 
glycerin). 

4. Rapidity of evaporation (temperature, air movement,
light, air pressure).

5. Rocks (shale, igneous, metamorphic, sedimentary).

6. Organic nutrients (protein, salts, carbohydrate, fat).

7. Tooth (bone, enamel, cement, dentine).



8. Forms of energy (heat, electricity, gravity, light).

9. Durable woods (cedar, maple, cypress, chestnut).

10. Distribution of heat (expansion, conduction, radiation,
convection).

11. Fumigation (formaldehyde, alcohol, iodine, sulphur).

12. Vaccination (diphtheria, typhoid fever, smallpox, measles).

13. Inner ear (drum, cochlea, auditory canal, vestibule).

14. Respiration (diaphragm, auricle, ventricle, trachea).

15. Pancreas (steapsin, bile, amylopsin, glycogen). 

16. Mammal (bat, whale, squirrel, turkey). 

17. Fungus (fern, mushroom, yeast, moss). 

18. Stratified rock (granite, marble, limestone, shale). 

19. Paint (linseed oil, olive oil, sperm oil, peanut oil). 

20. Soap (salt, fat, alkali, turpentine). 


Directions: In each set of six names or terms within a
parenthesis one or more have no close connection or rela-
tionship with the term outside the parenthesis. Cross out such
terms as are not related to the first term by drawing a line
through each of them. For example, in the first exercise
Cicero and Seneca are crossed out because they did not live
in the so-called “Age of Augustus.”

1. Age of Augustus (Cicero, Virgil, Ovid, Agrippa, Livia,
Seneca).

2. Marathon (Darius, Mardonius, Themistocles, Pericles, 
Xerxes, Miltiades). 

3. Macedonia (Perseus, Mithridates, Hannibal, Philip, 
Olympia, Aristides). 

4. Fifth century B. C. (Thermopylae, Salamis, Issus, Cannae,
Marathon, Syracuse).

5. Sparta (Demosthenes, Leonidas, Agesilaus, Cimon, Lycurgus,
Brasidas).

6. Egypt (Memphis, Karnak, Palmyra, Utica, Joppa, Cyzicum).

7. Alexander the Great (Ptolemy, Antiochus, Parmenio,
Roxana, Antipater, Odoacer).



8. Rome (tribunes, ephors, censors, aediles, praetors,
boeotarchs).

9. Hannibal (Fabius, Brutus, Scipio, Varro, Hamilcar,
Gracchus).

10. Battle of Arbela (Philip, Pyrrhus, Xerxes, Alexander,
Darius, Mithridates).

11. Battle of the Metaurus (Hannibal, Hasdrubal, Hamilcar,
Hanno, Nero, Fabius).


The next test, which is in arithmetic, illustrates a 
somewhat different variation of this same general 
type. It consists of a number of sets of items or ex- 
pressions, there being ten in each set, and pupils are 
instructed to indicate the five largest (or smallest, if 
the teacher so desires) of each set. It will be seen that 
in all the sets except the third some preliminary work 
is required to determine the values of the various 
items. If, as is probably best, it is required that this 
be done mentally, the test is probably too difficult 
for any except the upper elementary grades and 
perhaps this is true even if some use of pencil and 
paper other than the mere indication of answers is al- 
lowed. 


Directions: You see on this page five lists each of which 
contains ten numbers or pairs of numbers connected by some 
sign. You are to determine the five numbers or pairs of num- 
bers in each column which are larger than any of the other 
five in the same column and mark them by drawing a ring
around them. For example, if the 6 X 8 at the top of the 
first column is larger than five of the other combinations in 
that column, you should draw a ring around it. You do not 
need to determine the exact value of each term but merely 
whether it is larger than five other terms in the same column 
or not. 



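    1           2           3          4           5

  6 × 8     [fraction]    .0954    [fraction]    75 ÷ 3
  7 × 7     [fraction]    .1916    [fraction]    52 ÷ 2
  5 × 9     [fraction]    .3450    [fraction]    81 ÷ 3
  4 × 12    [fraction]    .2000    [fraction]    96 ÷ 4
  5 × 11    [fraction]    .0099    [fraction]    46 ÷ 2
  6 × 9     [fraction]    .1111    [fraction]    120 ÷ 5
  7 × 8     [fraction]    .0090    [fraction]    108 ÷ 4
  5 × 10    [fraction]    .0199    [fraction]    132 ÷ 6
  4 × 14    [fraction]    .4821    [fraction]    56 ÷ 2
  3 × 15    [fraction]    .0327    [fraction]    175 ÷ 7
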


4. Compound multiple-answer tests. Compound mul-
tiple-answer tests are sometimes included in the same
class with plural multiple-answer tests but because 
of a rather fundamental difference it seems well to 
separate the two varieties. The term plural multiple-
answer test has, therefore, been restricted to apply
to that variety in which all of the suggested answers 
bear the same relationship to the exercise, that is, 
form a homogeneous group of possible answers from 
which one or more are to be selected on a single desig- 
nated basis. A compound multiple-answer test, on 
the other hand, involves the selection of answers on 
at least two bases. There may be a single homoge- 
neous group of answers from which one, or rarely 
more, is to be selected on a certain basis, another on 
some other basis, and perhaps even a third on a still 
different basis ; or there may be two or more groups 
of suggested responses from each group of which one 
or more answers should be selected on a certain stated 
basis, the basis differing for the different groups. The 
first variety is illustrated by the first of the following 
tests, which deals with American history. In this test 
are a number of groups of four items each. In some
of these groups the items are the names of historical
characters and in others of events. Pupils are in- 
structed to indicate in one way the one of each group 
of four characters who lived first, or the one of each 
group of four events which occurred first, and in an- 
other way the character who lived or event which oc- 
curred last. Following this is another example, like- 
wise from United States history, which illustrates 
the second variety of compound multiple-answer tests. 
It contains the names of a number of men most of 
whom took prominent parts in some war in which this 
country participated. Each name is followed by three 
blanks. On one of these pupils are to indicate in which 
one, if any, of several wars named each person parti- 
cipated. On another they are to respond so as to show 
whether he was a soldier or sailor, and on the third 
whether he fought for or against this country. Follow-
ing this test are two more examples of the same vari-
ety of compound multiple-answer tests which illustrate
different forms and possibilities thereof.


Directions: On the page below you will see twelve groups
each of which contains the names of four prominent char-
acters in American history, and eight groups each of which
contains the names of four important events in American
history. Look over each group of four names or events and
find the name of the character who lived first or the event
which occurred first, and also the one in each group which
was last. Place a “1” in front of the earliest character or
event and a “2” in front of the latest one in each group. For
example, if you had the group shown below you should place
a “1” in front of Columbus because he lived first and a “2”
in front of Grant because he was the latest.

LaFayette
Columbus
Grant
Ponce de León



Eric the Red
Columbus
Vasco de Gama
Magellan

John Cabot
Francis Drake
William Penn
John Winthrop

Cortez
Frontenac
Montcalm
Lief Ericsson

Daniel Boone
Stephen A. Douglas
John Hancock
Peter Stuyvesant

Henry Clay
Jefferson Davis
John Adams
Benjamin Franklin

W. H. Seward
Samuel Adams
John C. Calhoun
Benjamin Harrison

La Salle
De Soto
Paul Revere
General Wolfe

Andrew Johnson
Zachary Taylor
Roger Williams
John Hay

Jacques Cartier
Citizen Genet
Gen. Braddock
Ponce de Leon

James Monroe
Patrick Henry
Aaron Burr
Daniel Webster

Captain John Smith
Pizarro
John Jay
Tecumseh

Brigham Young
DeWitt Clinton
Ethan Allen
Elihu Root


Battle of Long Island
Siege of Boston
Battle of Saratoga
Battle of Trenton

Conquest of Mexico
Discovery of the Mississippi
Founding of Montreal
Magellan's voyage around the world

Settlement of Maryland
Settlement of New York
Settlement of Connecticut
Settlement of North Carolina

Battle of Gettysburg
Battle of Antietam
Emancipation Proclamation
Second election of Lincoln

Death of Hamilton
Purchase of Louisiana
Trial of Burr
War with France

Annexation of Texas
Missouri Compromise
Dred Scott Decision
Death of John Quincy Adams

Spanish-American War
McKinley Tariff Bill
Bryan's first campaign
Cleveland's second election

Our entry into the World War
Deposition of the Czar of Russia
Italy's entry into the World War
Sinking of the Lusitania



Directions: You will find below the names of ten men, most
but not all of whom took part in wars in which this country
was engaged, either before or after it became independent.
Following each name are three blanks. On the first blank
you are to indicate the war, if any, in which each of these
characters took part. Do this by writing the name of the war
on the blank, using one of the following four names: French
and Indian, Revolution, 1812, Mexican. On the second blank
after each name indicate whether the man was a soldier or
sailor. Do this by writing the proper word, “soldier” or
“sailor,” on each blank. On the third blank after each name
indicate whether the man named fought on our side or on
the other side. Do this by writing “U. S.” if he fought on
our side, and “enemy” if he fought on the other side. To il-
lustrate what you are to do, the blanks after the first name
have been filled out.

1. Bainbridge    1812    sailor    U. S.

2. Brooke 

3. Gage 

4. Greene 

5. Harrison 

6. Marlborough 

7. Pakenham 

8. Santa Anna 

9. Tarleton

10. Taylor

11. Wolfe 

Directions: You will find below a number of statements
each of which contains two parentheses with several words
in each. By selecting the proper expression from each pa-
renthesis and using it in the sentence in the place where the
parenthesis occurs a true sentence can be made. For example
in No. 1 the selection of “protoplasm” from the first paren-
thesis and “jelly-like” from the second makes the follow-
ing true sentence: “The protoplasm, the primal material of
life, is a jelly-like substance.” Therefore “protoplasm” and
“jelly-like” have been underlined. Look through the other
statements and underline the one word or expression in each
parenthesis which will make them true.



1. The (cell, protoplasm, molecule, atom), the primal ma-
terial of life, is a (jelly-like, relatively hard, liquid) substance.

2. Irritability is the power of responding to a (pain, in- 
jury, stimulus, desire) by a (movement, sensation, response) 
of some kind. 

3. New cells originate from the (reproduction, division, as- 
similation) of (spores, ova, old cells, protoplasm). 

4. The lack of (phosphorus, salt, starch, calcium) causes 
the disease of (infantile paralysis, dementia, rickets). 

5. (Mendel, Galton, Darwin, Huxley) did most to develop
the modern idea of (antiseptic surgery, evolution, entomology).

6. The (Glacial, Jurassic, Devonian, Cambrian) period is
in the (Mesozoic, Paleozoic, Cenozoic) era.

7. A (phylum, family, genus, species) is the (largest, 
smallest, second) division of the animal kingdom. 

8. (Mammals, snails, spiders, sponges) belong to the (Coel- 
enterata, Annelida, Arthropoda). 

9. Sleeping sickness is caused by a variety of (round and 
flat, long and slender, spherical) (Porifera, Bryozoa, Proto- 
zoa, Phoronidea). 

10. The (Brachiopoda, Mollusca, Trochelminthes, Bryozoa)
include (starfish, hookworms, oysters). 

11. The phylum (Protozoa, Vertebrate, Prochordata, Arth- 
ropoda) includes the most species of any, an example of it 
being the (scorpion, snail, sea anemone). 

Directions: You will see below the names of ten cities. 
Under each are the names of four states, four numbers, and 
the names of four rivers. Draw a line under the one of the 
four states named under each city in which it is located. Draw 
another line under the one of the four numbers which comes 
nearest to being its population according to the last census. 
Draw a third line under the one of the four rivers upon which 
the city is situated. Do not underline more than one state, 
one number, and one river for each city. 

1. Nashville 

North Carolina, West Virginia, Tennessee, Kentucky 
75,000, 90,000, 125,000, 150,000 
Cumberland, Tennessee, Ohio, Monongahela 



2. Little Rock

Arkansas, Texas, Louisiana, Oklahoma
25,000, 65,000, 110,000, 130,000
Mississippi, Black, Arkansas, Red

3. Hartford

Massachusetts, Vermont, Rhode Island, Connecticut
140,000, 170,000, 200,000, 250,000
Thames, Housatonic, Connecticut, Merrimac

4. Spokane

Oregon, Montana, Idaho, Washington
75,000, 100,000, 125,000, 150,000
Columbia, Spokane, Snake, Yellowstone

5. Bismarck

North Dakota, South Dakota, Montana, Wyoming
10,000, 30,000, 50,000, 75,000
Platte, Yellowstone, Mississippi, Missouri

6. Kansas City

Nebraska, Missouri, Iowa, Arkansas
175,000, 225,000, 325,000, 450,000
Mississippi, Arkansas, Platte, Missouri

7. Frankfort

Ohio, Kentucky, Tennessee, West Virginia
5,000, 10,000, 25,000, 50,000
Kentucky, Ohio, Tennessee, Cumberland

8. Omaha

Kansas, Missouri, Nebraska, Iowa
75,000, 110,000, 145,000, 190,000
Kansas, Missouri, Platte, Cedar

9. Portland

Oregon, Washington, California, Idaho
110,000, 160,000, 210,000, 260,000
Willamette, Columbia, Snake, Sacramento

10. Harrisburg

Maryland, New York, Pennsylvania, West Virginia
30,000, 50,000, 75,000, 100,000
Allegheny, Susquehanna, Monongahela, Delaware



A different variety of test from any of those given
in the last few pages, but one which appears to belong
under compound multiple-answer tests rather than
elsewhere, is illustrated by the material given below.
Each exercise in this test consists of two sets of three 
words each. One word in each set of three is synony- 
mous with one word in the other set of three in the 
same exercise. 

Directions: After each of the numbers on the page below 
you will see two groups of three words each. Each group is 
enclosed in a parenthesis. Some one word in the first group 
means the same as one of the French words in the second 
group. Look at each two groups of words. Find the two words, 
one in each group, which mean the same and draw a line 
under each of them. For example, after No. 1 the word “easy”
in the first parenthesis means the same as “facile” in
the second, so both these words have lines drawn under 
them. 

1. ( easy , hard, faculty) (gris, facile , gauche) 

2. (like, want, fear) (craindre, croire, battre) 

3. (physician, lawyer, merchant) (paysan, maître, avocat)

4. (late, soon, often) (bientôt, autour, chez)

5. (hit, grab, pull) (fumer, envoyer, frapper) 

6. (run, walk, rise) (revenir, marcher, sonner) 

7. (lose, hunt, sell) (chercher, peigner, partir) 

8. (want, hope, think) (appeler, aimer, penser) 

9. (sick, well, weak) (loin, malade, lent) 

10. (blow, word, set) (mot, inoise, midi) 

11. (hand, hat, glove) (guichet, fourchette, gant) 

5. Multiple-reason tests. Multiple-reason tests differ 
from other multiple-answer tests only in the fact that 
the suggested answers are reasons instead of facts 
bearing some other relationship to the exercises. Such 
tests may take any one of several forms though a mo- 
ment’s thought will reveal the fact that many of the 





forms already illustrated are not appropriate for this
variety. Only one form or arrangement is, however, 
very common. This consists of the statement of a fact 
under which are listed several possible reasons, causes, 
or explanations. 

Because of the close similarity in form of this vari- 
ety to a number of the examples already given in this 
chapter, only three illustrations will be given. In the
first of these, which is in Latin, the reason for the construction
of a particular word in a given sentence is
called for, there being only one correct reason in each
case. The second, which is in home economics, calls for 
the indication of a single answer, but includes in the 
group answers of varying degrees of merit. The third, 
dealing with European history, presents groups of 
five answers, most of which contain two or more cor- 
rect ones, all of which are to be indicated by the pupils. 
It is, of course, possible to have multiple-reason ex- 
ercises of the compound variety also, but because the 
occasions for their use are rare and because they are 
similar in form to ordinary compound multiple-an- 
swer exercises, no especial illustrations of them have 
been added. 

Directions: You will see below ten Latin sentences. On the
line below each sentence the form of some one word in this 
sentence is stated. This is followed by four possible reasons 
why the word is in that form. One of each set of four reasons 
is right and the other three are wrong. Place the letter found 
in front of the one reason of each four which is correct on 
the blank line in front of the number of the exercise. 

1. Vir adest ut suum amicum videat.

videat is subjunctive, because it expresses 
a. result; b. purpose; c. indirect question; d. indi- 
rect discourse. 




2. Caesar se ad Germaniam iturum esse dixit.

se is accusative, because it is
a. object of dixit; b. object of iturum esse; c. object
of ad; d. subject of iturum esse.

3. Duo homines quos vidimus ad Galliam ibant.

Galliam is accusative, because it is 

a. object of ad; b. object of ibant; c. place whence; 

d. place where. 

4. Marcus et Tullus fuerant nostri amici.

amici is nominative plural, because it is 

a. in agreement with nostri; b. object of fuerant; 

c. used after a form of sum; d. subject of fuerant. 

5. Scitisne hominem quem hodie videritis?

scitis ends in “tis” because it is

a. perfect tense; b. first person; c. second person;

d. singular number. 

6. Milites Germani magnae magnitudinis erant. 

magnitudinis is genitive of 

a. description; b. possession; c. the whole; d. man- 
ner. 

7. Cur amici sui non venissent non scivit. 

venissent is subjunctive, because it is used to ex- 
press 

a. purpose; b. indirect question; c. doubt; d. result.

8. Romam contendamus ut ludos videamus.

videamus is subjunctive, because it expresses
a. purpose; b. command; c. wish; d. result.

9. Gallis non erant arma bona validaque.

Gallis is dative 

a. with compound verb; b. of indirect object; c. 
of direction; d. of possession.

10. Tres legiones Romanae a Germanis superatae sunt. 

Germanis is ablative showing 

a. place from which; b. agent; c. means; d. place 

where. 

Directions: On the page below are ten principles or rules
which you are supposed to have learned. Beneath each are





four reasons why it is true. In most cases two or more of these 
reasons are true, but one out of each group of four is a bet- 
ter reason than any of the other three. Draw a line under 
the letter in front of the best reason of each four. 

1. A double boiler is used in cooking because: 

a. it reduces the possibility of the food burning or sticking. 

b. it saves fuel. 

c. it makes it unnecessary constantly to watch the food 
being cooked. 

d. it cooks the food more evenly. 

2. A fireless cooker is desirable because:

a. there is practically no danger of fire.

b. food cooked in it tastes better. 

c. fairly uniform heat for a long time is secured at little 
cost. 

d. it reduces the time one needs to spend in the kitchen. 

3. Dishes should be rinsed in hot water because: 

a. it removes the soap. 

b. it makes them shiny. 

c. it saves washing with soap. 

d. it kills the germs. 

4. A vacuum cleaner should be used rather than a broom
because:

a. it is easier to use. 

b. it does not wear out rugs or carpets so rapidly. 

c. it is cheaper. 

d. it cleans more thoroughly. 

5. A household budget should be adopted because: 

a. it helps save money. 

b. it results in wiser expenditures. 

c. it shows one what to buy. 

d. it saves time. 

6. A young child should have his chief meal at noon because:

a. he is hungriest then. 

b. he may not have had a large breakfast. 

c. he will have a good opportunity to digest it. 

d. his sleep will not be interfered with. 




7. A kitchen floor should be covered with linoleum because:

a. it is cheaper than other materials. 

b. it looks better than other materials. 

c. it wears better than other materials. 

d. it is easy to keep clean. 

8. The cold-pack method of canning is better than the old 
method because: 

a. it requires little fuel. 

b. it is easy to do. 

c. it makes the food more likely to keep. 

d. it preserves the flavor. 

9. Woolen clothing should usually not be washed in very 
hot water because: 

a. it does not get it clean. 

b. it makes it fade. 

c. it makes it shrink. 

d. it is hard on the hands. 

10. Hot water or steam heat is better than hot air heat because:

a. it produces a more even temperature. 

b. it uses less fuel. 

c. it requires less attention.
d. it is cleaner. 

Directions: Below you will find ten sets of five reasons each.
Some of each group of five are true reasons or explanations 
for the fact stated just above that group. Draw a ring around 
the letter in front of each true reason for each of the ten 
events stated. There is at least one true reason in each group 
of five and in most groups there are two or more which are 
true. 

1. Napoleon was defeated at the battle of Waterloo because:

a. the heavy rain hindered his artillery from coming into 
action. 

b. the commanders opposed to him were better generals 
than he. 

c. the French troops were only half-hearted in their 
fighting. 

d. the opposing army was composed largely of veteran
soldiers. 




e. his army was much inferior in numbers to the opposing
one.

2. James II was expelled from the throne of England and 
William and Mary called to it because:

a. James II was a Catholic. 

b. James II was a Protestant. 

c. he had ruled too tyrannically. 

d. he was suspected of plotting to abolish Parliament. 

e. he was thought to be too much under Spanish influence. 

3. The capture of Constantinople by the Turks was due to 
the fact that: 

a. the Turks had an able commander and a large army. 

b. the Emperor Constantine was a coward and afraid to 
fight. 

c. a great many of the men of Constantinople were monks, 
so the fighting forces were small. 

d. a large party within the city openly aided the Turks. 

e. the city had no walls or other fortifications to aid in 
its defense. 

4. Mary, Queen of Scots, was beheaded because: 

a. the French encouraged Queen Elizabeth to do so. 

b. she was suspected of plotting to obtain the throne of 
England. 

c. she was a Protestant.

d. the English wished to prevent her from becoming Queen 
of Scotland. 

e. Queen Elizabeth disliked and feared her. 

5. The defeat of the Spanish Armada was due to the fact 
that: 

a. it was very scantily manned.

b. the Spanish ships were comparatively unwieldy. 

c. it suffered from unfavorable weather conditions. 

d. the French aided the English in fighting against it. 

e. the English had better naval commanders than the
Spanish. 

6. The Germans were victorious in their war with France
in 1870-71 because: 

a. their army was better drilled and prepared. 





b. their soldiers were much braver than the French
soldiers. 

c. their generals showed more ability than the French 
generals. 

d. the French were not expecting war and were surprised. 

e. they had a much larger army. 

7. The empire of Charlemagne did not survive after his 
death because: 

a. by his will he divided it among his sons. 

b. no man of sufficient ability to dominate it appeared. 

c. it was poorly organized. 

d. the various portions of it were hostile. 

e. the name “Holy Roman Empire” was greatly hated.

8. The battle of Hastings was won by the Normans because:

a. their coming took the British by surprise. 

b. the British had recently been weakened by fighting the 
Danes. 

c. the Norman army was very much the larger. 

d. William the Conqueror was a very skillful general. 

e. many of the British fought on the side of the Normans. 

9. England and Scotland were finally united because:

a. the English finally conquered the Scotch. 

b. the royal families of the two countries intermarried. 

c. the French refused to help the Scotch any longer. 

d. most of the people of both countries became members 
of the same church. 

e. the same man became heir to the thrones of both coun- 
tries. 

10. The Church of England broke away from the Roman
Catholic Church because: 

a. the English people were rather favorable to the Prot- 
estant Reformation. 

b. the preaching of John Calvin was very influential in
England. 

c. Henry VIII was angry that the pope would not grant
him a divorce. 

d. the pope attempted to dictate who should be King of 
England. 

e. Queen Mary was hostile to the Catholic Church. 





6. Multiple-description tests. This name is applied to 
that variety of multiple-answer tests in which the sug- 
gested answers are descriptions. As was stated con- 
cerning multiple-reason tests, so also those of the 
multiple-description variety may take a number of 
different forms, but only one of these is common. This 
is illustrated by the three tests given below. These 
differ in the same way as did the three in the last sec- 
tion. In the first there is only one really correct an- 
swer, the others containing incorrect statements. In 
the second are answers of varying degrees of merit, 
the chief difference being in the number of facts stated, 
though there are some errors present also. The third 
is of the plural type, that is, pupils are to indicate one 
or more correct answers as the case may be. 

Directions: Below are three descriptions of the Supreme 
Court of the United States. One of these is a correct description
as far as it goes whereas the other two are not. Read
these three descriptions and draw a line around the letter in 
front of the correct one. 

A. The Supreme Court consists of nine members who are
elected by Congress and serve terms of six years after
which they are eligible for re-election. The Court sits 
at Washington where it hears all of the most important 
cases in the country. There is no direct appeal from its 
decision, but an act of Congress can be passed which takes 
precedence over a decision of the Supreme Court. 

B. The nine members of the Supreme Court, one Chief and 

eight Associate Justices, are appointed by the President 
and serve for life or during good behavior. Most of the 
cases brought before this Court are appealed from the 
lower courts. Its original jurisdiction is limited to cases
in which a state or diplomatic representative is a party. 
The decisions of the Court are, in a sense, more authorita- 
tive than the acts of Congress. In other words, the Court 




may annul an act of Congress by declaring it contrary 
to the Constitution.

C. The President of the United States, with the approval of 
the Senate, appoints the six members of the Supreme 
Court. The term of office is the same as that of the President,
although it is customary for succeeding Presidents
to re-appoint members. The Court is presided over by the
Chief Justice whose vote in rendering a decision counts
the same as that of two other Justices. The cases brought
before the Court are limited to those appealed from 
lower federal courts. 

Directions: You will find below four descriptions of the
lungs. Read these and place a check mark (✓) in front of
the best description. 

A. The purpose of the lungs may be said to carry on the proc- 

ess of oxidation, which is giving oxygen to the red cor- 
puscles in the blood. Air is taken through the nose, phar- 
ynx, larynx, trachea and bronchial tubes into minute air 
sacks. On the surface of these sacks is a close network 
of capillary tubes and here oxidation takes place. At in- 
spiration the lungs expand in much the same manner 
that a rubber bag stretches and at expiration they con- 
tract again. 

B. The lungs are situated in the upper part of the thorax 

where they are protected by the ribs. They are composed 
of bronchial and blood tubes. Air is taken in through the 
throat. Through the thin walls of the tubes mentioned 
the air extracts impurities from the blood and then carries 
them out. 

C. The lungs are two in number, one on the right and one on 

the left of the heart. They are surrounded by a smooth 
membrane called the peritoneum. Inside of this membrane 
are large bags which are filled with air at inspiration and 
collapse at expiration. Their capacity is about 100 cubic 
inches and ordinarily this amount is breathed in and out 
about eighteen times a minute. The lungs are connected 
with the nose by the larynx, above which is the trachea. 





D. The function of the lungs is to aid the heart in its action. 
The fresh air is breathed in on every second heart beat, 
while expiration occurs on the alternate ones. The air 
taken in is purified by the cilia before it reaches the 
lungs. It is then passed around the heart to stimulate it 
in its action. This is accomplished by furnishing a supply 
of oxygen in which the heart works. The parts of the 
lungs are the pharynx, which is the long tube connecting 
them with the nose and mouth, the bronchial tubes, 
and the pleura, which are the thin-walled cells through 
which the oxygen is passed. 

Directions: Below are four short paragraphs each of which
describes the whale. Read these four paragraphs and select 
the one or more which give true descriptions of the whale, that 
is, which do not contain incorrect statements. Draw a short 
line under the number in front of each description which you 
think is true. 

1. The whale is a large mammal which lives in the sea. In 

general, it is shaped like a fish, the chief point of dif- 
ference being that the tail is flattened horizontally rather 
than vertically. Rudimentary teeth, hind legs and other 
features of land animals are found. Whales feed chiefly 
upon animal food. 

2. The whale is the largest living fish. Sometimes it is as 

large as one hundred feet long and weighs about one 
hundred and fifty tons. The home of the whale is in the 
north, although occasionally one is seen elsewhere. It is 
hunted largely because of the whalebone which may be 
obtained from it. 

3. The largest living mammal is the whale, which has its habi- 

tat chiefly in northern waters. Instead of forelegs it 
has flippers and instead of hind legs mere rudimentary 
bones. There are several varieties of whales of which the 
sulphur-bottom is the largest and the sperm whale prob- 
ably the best known. The latter and also other varieties 
are hunted because of the oil which may be secured from 
them. Although whales can remain under water for a long 
time, they differ from fish in being obliged to come to





the surface to renew their supply of air from time to 
time. 

4. The largest living mammal is the whale, which is to be 
found in the Pacific Ocean. It resembles a fish in appear- 
ance but not in habits. Practically all of its time is spent 
upon the surface of the sea rather than beneath the waves. 
Also it frequents shores that it may feed on the vegetation
at the water’s edge. The whale is hunted because
of the value of its skin for leather. One peculiar and 
interesting habit is that it spouts out through its nostrils 
water which has been taken in through its mouth. 


7. Summary. The multiple-answer type of the new 
examination is probably, everything considered, the 
type which should receive most common use. It is adap- 
ted to practically all subjects and phases of subjects. 
It is true that there is a slight possibility of pupils 
being confused by the suggestion of incorrect answers, 
that guessing is perhaps encouraged, and that pupils 
are not made to rely entirely upon their own initiative, 
but these are all minor disadvantages. In the construction
of such tests it is important that the incorrect answers
be not too evidently so. By changing the incorrect
answers so as to make them more or less similar
to the correct ones, the teacher can regulate the dif- 
ficulty of multiple-answer exercises. The ordinary 
method of scoring such tests is simply to take the num- 
ber of right responses as the score, although for very 


careful work the formula Score = R - W/(N - 1) is preferable.
Its use is not recommended for ordinary class-room
testing, however. The varieties of multiple-answer
tests dealt with in this chapter are the ordinary
multiple-answer test, in which pupils select just
one of the suggested answers to each exercise; the





plural multiple-answer test, in which they select one
or more; the compound variety in which they select
two or more but on different bases; the multiple-reason
test, and the multiple-description test.
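
The two methods of scoring mentioned in this summary may be compared by means of a short Python sketch. It is offered only as an illustration: the function name and the sample figures are assumptions made for the example and do not come from the text, and omitted exercises are assumed to count neither as right nor as wrong.

```python
from fractions import Fraction


def corrected_score(right, wrong, n_choices):
    """Score = R - W/(N - 1), where N is the number of suggested answers.

    Hypothetical helper for illustration; omitted items are not counted.
    """
    if n_choices < 2:
        raise ValueError("each exercise needs at least two suggested answers")
    return Fraction(right) - Fraction(wrong, n_choices - 1)


# Illustrative pupil: 40 exercises with four suggested answers each,
# of which 28 are right, 9 wrong, and 3 omitted.
simple = 28                            # ordinary method: number right
corrected = corrected_score(28, 9, 4)  # 28 - 9/3
print(simple, corrected)               # prints: 28 25
```

With only two suggested answers the same function reduces to right minus wrong, so that corrected_score(10, 4, 2) gives 6.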



CHAPTER XII 
ALTERNATIVE TESTS 

1. General discussion. The expression alternative
test is here used to include tests of various kinds which 
require pupils to choose between two possibilities in 
making their responses to each exercise or item, and 
also some varieties in which a third more or less neu- 
tral response may also be given. Therefore this chapter 
deals not only with true-false and yes-no tests, but 
also with ordinary multiple-answer tests which sug- 
gest only two answers. Sometimes all alternative tests 
are classed under the multiple-answer type but for 
several reasons it seems best not to do so. Alternative 
tests may take forms not possible for those of the 
multiple-response type which suggest more than two or 
perhaps three answers; they offer decidedly greater 
chances of guessing correct answers; largely because 
of this latter fact they are usually scored according to 
a different formula than is employed for most multiple- 
answer tests ; and, finally, in the minds of most 
teachers and pupils and in the discussions of most per- 
sons who have written concerning them, they have been 
thought of as a separate type. 

It is unfortunately true that the true-false state- 
ment has not only received much more use and atten- 
tion than any other variety of alternative exercise, but 
also has been the one type of the new examination to 
receive more emphasis than any other. This is un- 






fortunate because the true-false test is not one of the 
few best types of objective tests. This latter statement 
applies also to all alternative tests though not with 
such great force as to the true-false variety alone. 
Instead of receiving the most frequent use such tests 
should, on the whole, be employed less often than those 
of the multiple-answer type with a larger number of 
suggested responses, of the matching type, the com- 
pletion type, and perhaps of others. They do, however, 
have sufficient merit that their use should not be en- 
tirely discontinued. Partly because pupils respond to 
a larger number of items per unit of time in true-false 
or yes-no form than is true for any of the other com- 
monly used types of objective tests, alternative tests 
may sometimes profitably be used to test knowledge of 
facts and details, ability to memorize, and so forth. 
Indeed, they can be so constructed and worded as to 
measure also knowledge of general principles and
laws. Furthermore, because of their form of arrange- 
ment, they tend to measure critical ability of a certain 
sort which is frequently required in life outside of 
school. The average individual is very often confronted 
with two possibilities of which he must choose one. He 
must frequently vote for one of two parties or for 
one of two men ; he must decide whether he is in favor 
of or against a proposal or motion upon which he is 
to express an opinion or record a vote; he must, if a 
farmer, decide whether to use horses or a tractor, and
so forth. 

On the other hand, as was implied by the first sen- 
tences of the last paragraph, alternative tests, espe- 
cially certain varieties thereof, have a number of dis- 
tinct disadvantages. There seems to be little doubt that 
certain varieties of alternative exercises, especially 




false statements, are somewhat more confusing than
most other kinds of exercises, although this effect has 
often been overestimated. If they deal with material 
which has been fairly well mastered and if the errors 
made are carefully and impressively corrected, most, 
if not all, of the possible confusion of ideas is avoided. 
Remmers and Remmers (70), for example, present
data which show that the after effects of true-false 
exercises need not be feared. Ballard (2, pp. 96-98) 
goes even further than this and gives a few data to 
show that greater improvement was shown on the false 
than on the true items of a certain test when pupils 
were retested following discussion of the results. 

Another charge frequently brought against alterna- 
tive tests of all varieties is that the guessing element 
is too great. Because the chance of guessing the correct 
answer is one out of two, pupils are often encouraged 
to make guesses. When such tests were first introduced 
it was recommended by McCall (49) and others that 
pupils be directed to guess upon all items to which they 
did not know or think they knew the answers. The con- 
sensus of opinion at present, however, appears to be 
against this practice and rightly so for at least two 
reasons. It is generally agreed that the development of 
the habit of guessing based on little knowledge is not 
a desirable outcome of instruction. Secondly, several 
studies, such as those of Ruch and DeGraff (72 and
74, pp. 71-74), show that scores are somewhat more 
reliable and valid if pupils are instructed not to guess 
than if they are directed to guess. 

Probably because of the considerable amount of 
difficulty involved in constructing true-false statements 
and yes-no questions to which one answer or the other 
is absolutely correct and which are in no way mislead- 





ing, considerable attention has been given to the matter 
of how to formulate the statements or questions in 
such tests. Weidemann (91) has gone into this matter
much more carefully than has been done for the other 
types of the new examination. Most of the following 
suggestions and also a great many more may be found 
in his discussion. 

It is usually desirable to prepare a list of affirmative 
true statements. For a true-false test the proper num- 
ber of these can then be made false by the change, 
insertion, or removal of one or more words. If it is 
desired to use the yes-no form, these statements can be 
changed into question form and half of them made such 
that the proper answers are in the negative. The false 
statements or negative questions should commonly 
present either plausible misconceptions or absurdities 
such that affirmative responses indicate no real think- 
ing on the subject. Double negatives and usually even 
single ones should be avoided. Expressions such as 
“never,” “always,” “all,” and so forth, which Weidemann
calls “specific determiners,” are also undesirable
because pupils frequently catch on to the fact that each 
one of such determiners tends to be used most often in 
the same kind of statements, that is, in either true or 
false ones, according to its character. Broad generaliza- 
tions, which are almost always false, and generaliza- 
tions with unnecessary modifiers, dependent clauses or 
qualifying phrases, which are usually true, are un- 
desirable. Statements or questions should not be very 
long. Compound sentences containing two or more
separate ideas should be avoided, or, if it is desired 
to include both ideas, broken up into two or more items. 
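
Several of these suggestions are mechanical enough to be checked roughly by machine. The following Python sketch is one possible way of doing so, not a method proposed by Weidemann: the list of negative words, the limit of twenty words, and the function name are all assumptions made for the illustration, and only the three determiners named above are looked for.

```python
import re

# The three determiners named in the text; a teacher might extend the list.
SPECIFIC_DETERMINERS = ("never", "always", "all")
# Assumed list of negative words, used only to spot single and double negatives.
NEGATIVES = ("not", "no", "never", "none")
# Assumed limit; the text says only that items should not be very long.
MAX_WORDS = 20


def review_statement(statement):
    """Return a list of warnings for one true-false statement."""
    words = re.findall(r"[a-z']+", statement.lower())
    warnings = []
    for determiner in SPECIFIC_DETERMINERS:
        if determiner in words:
            warnings.append(f"specific determiner: {determiner}")
    negatives = sum(1 for w in words if w in NEGATIVES or w.endswith("n't"))
    if negatives >= 2:
        warnings.append("double negative")
    elif negatives == 1:
        warnings.append("single negative")
    if len(words) > MAX_WORDS:
        warnings.append("statement is long")
    return warnings


for item in ("Rivers always flow toward the south.",
             "It is not true that iron does not rust."):
    print(item, review_statement(item))
```

A check of this kind can only flag items for a second reading; whether a statement is misleading or contains two separate ideas remains a matter for the teacher's judgment.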

It has commonly been stated and assumed that half 
or approximately half of the statements on a true-false 




test should be true and the other half false, and simi- 
larly with the questions on a yes-no test. Directions 
frequently inform the pupils of this fact in the en- 
deavor to keep them from marking too many answers 
in either way. Tabulations of results made by a few 
investigators, among whom are Fritz (27) and the
writer, show a definite and consistent tendency for 
pupils to respond “true” or “yes” more often than 
“false” or “no” in the case of exercises to which 
they do not know the correct responses. Fritz’ data 
indicate that approximately 62 per cent of such re- 
sponses are affirmative and 38 per cent negative, and 
therefore he recommends that instead of an equal num- 
ber of statements of each type, there be about six 
affirmative ones for each four negative ones. Presum- 
ably, if this ratio of guesses were to be maintained, 
it would still be necessary to inform pupils that the 
number of affirmative and negative statements or ques- 
tions was practically the same, which would be no 
longer strictly true. Moreover it is not clearly shown 
that the fact that pupils do guess one way more often 
than the other is a sufficient reason why there should 
be more exercises of one kind than of the other in- 
cluded in a test. A consideration of the effect of such 
guessing upon scores computed by the method recom- 
mended in the next few pages does not show that a 
different method is desirable because of this effect.
Therefore it is recommended that the commonly ap- 
proved and employed practice of having about an equal 
number of each of the two kinds of statements or 
questions be followed. 
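
The effect of one-sided guessing upon right-minus-wrong scores can be worked out with a little arithmetic, sketched here in Python. The sketch assumes a pupil who guesses at every one of 100 items and who answers affirmatively in 62 per cent of his guesses, the figure reported by Fritz; the function name and the two test compositions compared are assumptions made for the illustration.

```python
from fractions import Fraction


def expected_chance_score(n_items, share_true, p_guess_true):
    """Expected R - W for a pupil who guesses every item.

    share_true: fraction of items whose correct answer is "true"
    p_guess_true: probability that a blind guess is "true"
    """
    p_right = share_true * p_guess_true + (1 - share_true) * (1 - p_guess_true)
    expected_right = n_items * p_right
    expected_wrong = n_items - expected_right
    return expected_right - expected_wrong


bias = Fraction(62, 100)  # affirmative share of guesses reported in the text
balanced = expected_chance_score(100, Fraction(1, 2), bias)
six_to_four = expected_chance_score(100, Fraction(6, 10), bias)
print(balanced, six_to_four)  # prints: 0 24/5
```

Under these assumptions a test with equal numbers of true and false statements leaves the pure guesser with an expected score of zero whatever his preference for "true," whereas the six-to-four division gives him a small expected gain of about five points in a hundred.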

A further objection connected with the matter of 
guessing is that no entirely satisfactory method of 
scoring which allows for guessing has been suggested. 



The question of how to score alternative tests will be 
considered a little later and what is believed to be the 
best method stated and defended, but it cannot be said 
to be entirely satisfactory either from the standpoint 
of securing accurate individual scores or from that of 
being accepted as entirely just by all pupils. 

Ordinary alternative tests may be given at the aver- 
age rate of about four or five exercises per minute 
though of course the difficulty of the material may 
change this one way or the other. Data presented by 
Ruch and DeGraff (72) indicate that those in regular
multiple-answer form, that is, with two suggested an- 
swers other than “true” and “false,” or “yes” and 
“no,” are somewhat easier than true-false statements 
or yes-no questions in that pupils make higher scores, 
but there appears to be little difference in the amount 
of time required. 

There has probably been more contention and argu- 
ment as to the proper method of scoring alternative 
tests than on any other more or less detailed point con- 
nected with the new examination. Most of the discus- 
sion has urged that either the number of right an- 
swers alone be taken as the score or that the number 
of right answers minus the number of wrong answers 
constitute the score. A few persons have, however, 
proposed other methods. Under the discussion of scor- 
ing multiple-answer tests in the last chapter, the 


formula, Score = R - W/(N - 1),¹ was given as the generalized
form for all such tests. It will be seen that
substituting 2 for N in this formula reduces it to


¹ It will be recalled that in this formula R stands for the number of right
answers, W for the number of wrong answers, and N for the number of suggested
or possible answers to each exercise or item.




R - W, which is the usual form in which it is given in
connection with alternative tests. The arguments ad- 
vanced for and against the use of this formula have 
been both philosophical and experimental, that is, 
based upon theoretical considerations and also upon 
the results of investigations and the tabulation of data. 
The evidence of the latter sort will be taken up first. 

On this, as on many other questions connected with 
the new examination, Ruch has been one of the leading
investigators. In one of his earlier discussions of
the matter (71, pp. 118-119), he presents evidence
which points to the conclusion that any attempt to correct
for chance by subtracting the number of wrong
answers is undesirable since it results in lower reliability
of the scores so obtained. Paterson and Langlie
(66) also present data which tend to the same conclusion.
Later, however, Ruch and DeGraff (74, pp.
68-70, and 72) report and discuss results from a study
embracing a much larger number of pupils and tests
than were included in Ruch’s earlier study. They conclude
that the use of the R - W method of scoring is
preferable to taking merely the number right, because,
although their more complete data likewise indicate
that scores computed by the R - W formula are
slightly less reliable than those found in the other
manner, they show that such scores possess higher
validity, and the gain here more than offsets the loss
in reliability. Their later findings and conclusions are
supported also by the studies of B. D. Wood (98) and
of E. P. Wood (99), both of whom found that the
R - W formula resulted in more valid scores. It seems,
therefore, that on the whole the experimental evidence 
is definitely in favor of the use of this method in 
preference to that of the simple number right. 



Several other studies have indicated, however, that
it is possible to secure slightly more accurate scores
by the use of a different formula than that suggested
above. Foster and Ruch (25) found that the subtrac-
tion of the number wrong from the number right ap-
peared to be slightly too great a reduction in the scores
to produce the highest validity. May (52) obtained the
same result in even more pronounced fashion and rec-
ommends that only one-fifth of the number wrong be
subtracted from the number right. The evidence of
these two and a few other studies does not, however,
appear to be sufficient reason for forsaking the use of
the right-minus-wrong method of scoring. In the first
place, they do not agree upon just what fraction of the
number of wrongs should be subtracted from that of
the rights. Secondly, the increased reliability and
validity which they indicate for scores computed ac-
cording to their suggestions are so slight that they do
not warrant the extra amount of computation required,
at least for the ordinary purposes of the class-room.
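The difference between the two proposals may be seen in a brief Python sketch. The function name `penalized_score` and the parameter `penalty` are assumptions made for illustration and are not notation from the studies cited. A penalty of 1 gives the ordinary right-minus-wrong score, and a penalty of one-fifth corresponds to May's recommendation.

```python
from fractions import Fraction


def penalized_score(right: int, wrong: int, penalty: Fraction = Fraction(1)) -> Fraction:
    """Return right - penalty * wrong; `penalty` is a hypothetical parameter."""
    return right - penalty * wrong


# Hypothetical pupil: 40 right and 10 wrong on a 50-item alternative test.
print(penalized_score(40, 10))                    # 30
print(penalized_score(40, 10, Fraction(1, 5)))    # 38
```

The fractional penalty calls for a multiplication as well as a subtraction on every paper, which is the extra computation referred to above.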

In connection with the formula Score = R − W, it
cannot be denied that especially on short tests more or
less injustice is done to particular individuals. If it is 
true, as this formula assumes, that in the long run 
right and wrong guesses occur with equal frequency, 
it follows that this formula more often yields a score 
which represents the number of right answers which 
the individual really knew than does any other sug- 
gested method of scoring. On the other hand, however, 
it does not yield this result in anything like a majority 
of individual cases on any particular test. For ex- 
ample, if a fairly large number of pupils make pure 
guesses, that is, guesses unaided by any knowledge 
whatsoever, upon ten items, it is most likely that about
1 per cent of the group will guess nine right and one
wrong, over 4 per cent eight right and two wrong, al- 
most 12 per cent seven right and three wrong, 20 or 
21 per cent six right and four wrong, 25 per cent five 
right and five wrong, 20 or 21 per cent four right and 
six wrong, 12 per cent three right and seven wrong, 
4 per cent two right and eight wrong, and 1 per cent 
one right and nine wrong. Thus it is evident that the 
chance of guessing the same number, five, right and 
wrong, is greater than the chance of any other combi-
nation, but still is likely to happen in the case of only
one out of every four pupils. As Chapman (13), among 
others, has pointed out, this injustice to a majority of 
those taking each particular test must be admitted. 
In reply to this one may not only make the point just 
stated that any other method of computing the score 
would result in still more injustice, but also that in the 
long run, that is, in the averaging up of scores from 
a number of tests, the errors tend to balance one an- 
other so that it is unlikely that the total or average 
rating of any individual pupil upon a number of such 
tests is very unfair. 
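The percentages quoted for pure guessing on ten items follow from the binomial distribution with an even chance on each item. The sketch below, in Python, recomputes them; the function name `guess_distribution` is an illustrative assumption, and the calculation supposes that every guess is independent and equally likely to be right or wrong.

```python
from math import comb


def guess_distribution(n_items: int = 10) -> dict[int, float]:
    """Probability of k right answers when all n_items are pure 50-50 guesses."""
    total = 2 ** n_items
    return {k: comb(n_items, k) / total for k in range(n_items + 1)}


dist = guess_distribution(10)
for right in range(10, -1, -1):
    wrong = 10 - right
    print(f"{right:2d} right, {wrong:2d} wrong: {100 * dist[right]:5.1f} per cent")
```

Five right and five wrong is the single most probable outcome (252 chances in 1,024), yet it accounts for only about one pupil in four.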

Probably the chief of the theoretical or philosophical 
arguments against deducting the number of wrongs 
from that of rights is that, if a pupil answers correctly 
a certain number of the exercises on a test, he should 
receive as many points of credit as these exercises 
count. There are at least two answers to this argu- 
ment. One is to admit that he should receive credit 
for as many of his correct answers as he really knows 
are correct, but not for those which are mere lucky 
guesses. Most teachers and others interested appear to 
agree with this statement, but the reply is often made 
that we do not know how many of the correct answers
result from guesses. The rebuttal to this is in general
the same as the point made in the preceding paragraph, 
that the assumption most likely to be true in any par- 
ticular case is that if the guesses were made without 
any knowledge concerning the items involved,² the
number of right answers guessed is the same as the 
number of wrong answers. The second possible answer 
is to the effect that pupils should be penalized for 
wrong answers which they thought were right. A pupil 
who, for example, answers 25 items correctly and also 
answers 5 more which he thinks are correct, but which 
are wrong, should be penalized for these 5 wrong an- 
swers. In other words, his status with regard to knowl- 
edge of the subject is not as satisfactory as that of a 
pupil who also answers 25 items correctly but who 
knows that he does not know the other 5 and, there-
fore, does not attempt to answer them. That is to say,
it is better for an individual to know that he is igno- 
rant of certain facts than to think that he knows them 
when he does not. Evidently one of these two causes, 
that is, either guessing or erroneous supposition of 
knowledge, accounts for all, or practically all, wrong
answers. 

In order to make clearer the points brought out in 
the preceding discussion it seems well to illustrate it 
by discussing several suppositious cases of pupils with 
certain amounts of knowledge concerning the items on 
a true-false test and the resulting scores which they 
should receive. These illustrations and the discussions 
thereof follow the trend which the writer has found 
most effective in presenting the justice and desirability
of the right-minus-wrong method of scoring to students
in his own classes. They are based upon the supposi-
tion that a number of pupils, who will be called A, B,
C, and so forth, have taken an alternative test of 50
items.

2 It is usually true that guesses are to some extent guided by partial or in-
complete knowledge and are therefore more likely to be true than false. This
phase of the situation will be discussed two or three pages later.

To begin with the very simplest possible case, pupil 
A has responded to 30 items correctly and none in- 
correctly. In other words, he has responded to those to 
which he knows the answers and, because his disposi- 
tion is cautious and conservative or because he has 
obeyed the teacher’s instructions, has not made guesses 
on the others. There is, of course, no doubt as to what 
his score should be, and any of the proposed methods 
will yield him a score of 30. 

For the next illustration let us take pupil B who 
possesses exactly the same amount of knowledge as 
pupil A and therefore knows the correct answers to 
30 items of the test. B, however, is of a different tem-
perament from A and, instead of letting the other 20
exercises go unanswered, guesses upon them, although 
he knows nothing at all concerning the correct an- 
swers. As has been shown above, he is more likely to 
make 10 correct and 10 incorrect guesses than any 
other number of each. Most, if not all, persons will 
agree that the mere fact that B guesses on the items he 
does not know does not entitle him to a higher score 
than that received by A, who knew just as many as B 
but did not guess on the others. The R − W formula
results in giving B a score of 30, the same as A, since 
from B’s total of 40 correct responses, 30 of which 
he knew and 10 of which he guessed, his 10 incorrect 
ones are subtracted. 

Let us next consider pupil C, who is similar to A
and B in that he knew the answers to 30 items, but
differs in that he thought he knew the answers to 10
more, although as a matter of fact he got these wrong.
It seems scarcely justifiable to give C as high a score
as A or B who possessed as much correct knowledge
as C and were aware that they did not know the other
items, whereas C incorrectly thought he knew 10
others. Opinions may differ as to how much C should
be penalized for his responses to these 10. The R − W
formula, of course, assumes that one item erroneously
thought to be known balances one that is correctly
known and thus gives C a score of 20 (30 rights − 10
wrongs).

Pupil D presents a still different case. He knew the 
responses to 30 items and in addition had some general 
knowledge of the field covered by the others though he 
did not know absolutely the answer to any particular 
one of the other 20. However, he went ahead and an- 
swered all of them and, because of his general knowl- 
edge of the subject, succeeded in getting 15 of the 20 
answers right and only 5 wrong. Evidently D deserves 
some credit for the additional knowledge which re- 
sulted in 15 of the 20 being right in addition to that 
for the 30 which he knew perfectly. Probably the fair- 
est allowance is the excess of correct over incorrect 
answers, a result which is accomplished by the use of 
the R − W formula. In other words, pupil D’s score
is 45 (that is, 30 + 15) − 5, or 40, of which 10 points
is the allowance for the amount of knowledge he pos- 
sessed concerning the last 20 items. 
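The four suppositious cases may be tabulated with a few lines of Python. The dictionary `pupils` and the function name `right_minus_wrong` are illustrative assumptions; the counts of right and wrong responses are those given above for the 50-item test, with B's guesses taken at the most likely split of 10 right and 10 wrong.

```python
def right_minus_wrong(right: int, wrong: int) -> int:
    """Score on an alternative test: number right minus number wrong."""
    return right - wrong


# (right, wrong) for each pupil; omitted items are not counted.
pupils = {
    "A": (30, 0),    # knew 30, omitted 20
    "B": (40, 10),   # knew 30, guessed 20: 10 right, 10 wrong
    "C": (30, 10),   # knew 30, wrongly thought he knew 10 more
    "D": (45, 5),    # knew 30, partial knowledge gave 15 right, 5 wrong
}

scores = {name: right_minus_wrong(r, w) for name, (r, w) in pupils.items()}
for name, score in scores.items():
    print(f"Pupil {name}: {score}")
```

A and B receive the same score of 30, C falls to 20, and D rises to 40, in agreement with the discussion.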

As a matter of fact most of the actual cases which
occur will be more or less similar to the last one cited,
that of pupil D, with perhaps the additional element
that the pupils think they know the answers to a few
items to which they do not. Most pupils know the cor-
rect answers to a considerable number of the items on
a test, have enough knowledge concerning a number of 
the others so that their responses are not pure guesses 
but are more likely to be right than wrong, and think 
that they know a few of which they are ignorant. The 
right-minus-wrong formula gives such pupils credit 
for all the correct answers which they really know 
plus additional credit for that knowledge which results 
in more right than wrong guesses with a deduction for 
those items to which they think they know the answers 
but do not. Everything considered, this appears to be 
the best method of scoring entirely apart from any 
considerations of reliability, validity, and so forth. 

A possible way out of the difficulty of making the 
proper allowance for guessing has been offered by 
Christensen (14). His suggestion is to the effect that 
every true-false or yes-no test be followed by another 
in multiple-answer form dealing with exactly the same 
items of information as did the first. A pupil’s score
would consist of the number of items or exercises to 
which he responded correctly upon both of the tests. 
This would reduce the chance of guessing correctly to 
a negligible amount. For example, if the multiple-re- 
sponse test contained four suggested answers, there 
would only be one chance out of eight that a pupil 
would guess right upon the corresponding item in 
both tests. Christensen’s suggested procedure has some 
merit and its occasional use by teachers is recom- 
mended. On the other hand, it does not seem at all 
necessary or desirable to employ it whenever an alter- 
native test is used. The chief reason for not doing so 
is that it requires more than twice as much time to 
test knowledge of a given number of items as does a 
single alternative test. 
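The figure of one chance out of eight is the product of the two separate chances of a pure guess. A minimal Python sketch follows; the function name `chance_both_right` is an assumption for illustration, and the two guesses are supposed to be independent.

```python
from fractions import Fraction


def chance_both_right(n_choices: int) -> Fraction:
    """Chance of guessing an item right on the two-answer test and again on
    the paired multiple-answer test with n_choices suggested answers."""
    return Fraction(1, 2) * Fraction(1, n_choices)


print(chance_both_right(4))   # 1/8
```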



One minor point which has not yet been considered 
concerns the question of what score a pupil should re-
ceive in case he has more incorrect than correct re-
sponses. Such a result springs from one of two causes
or a combination thereof. By far the more frequent 
one is that, according to the laws of chance, the guesses 
of some of the pupils who know practically nothing 
about the items on the test are more often wrong than 
right. The second and less frequent cause is the fact 
that what some pupils think they know but do not more 
than balances what they really do know. Partly because 
chance rather than actual status of knowledge is the 
more common cause, and partly because of the rather 
general dissatisfaction with negative scores even 
though they be logically justified, it is here recom- 
mended that no negative scores be assigned on alter- 
native tests, but that zero scores be given to those 
pupils who have more incorrect than correct responses. 
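The recommendation amounts to setting a floor of zero under the right-minus-wrong score. In Python this may be sketched as follows; the function name `alternative_test_score` is an illustrative assumption.

```python
def alternative_test_score(right: int, wrong: int) -> int:
    """Right minus wrong, with zero assigned in place of any negative score."""
    return max(0, right - wrong)


print(alternative_test_score(30, 10))   # 20
print(alternative_test_score(12, 18))   # 0
```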

2. True-false exercises. Undoubtedly the most com- 
mon form of alternative tests is that composed of a 
series of true-false exercises, that is, statements each 
of which is to be marked to show whether the pupil 
thinks it is true or false. As has already been indicated, 
the writer does not believe that this type of testing 
material should receive the degree of prominence here- 
tofore accorded it, though, on the other hand, he does 
approve its occasional use in high school and perhaps 
the upper elementary grades. In the discussion found 
in the last few pages, a number of suggestions concern- 
ing the construction of such exercises have been made 
so that nothing further along this line will be added 
here. 

There are a number of minor variations which may
be followed in the form of presenting true-false ex-
ercises or statements. In the first place, three different
sets of responses are rather commonly used. These are
the words “true” and “false,” the initial letters of
these words, “t” and “f,” and sometimes the signs
“+” and “−,” the former indicating true and the
latter false. These may, of course, either be written
in by the pupils or already included on the test sheet.
The first of the two illustrative tests given is of the
latter sort, that is, the words “true” and “false” are
included immediately in front of each statement, and
pupils are to underline the one which applies in each
case. The second example calls for the use of the signs
“+” and “−” and for these to be written in by the
pupils upon blank spaces provided therefor.

Directions: Below are ten statements, about half of which 
are true and the remainder false. Read each statement and if
you think it is true, draw a line under the word “true” in 
front of it; if you think it is false, draw a line under the 
word “false.” Do not guess, that is, do not mark either “true” 
or “false” unless you feel sure that the statement is either 
true or false. 

True False 1. An Axminster rug is usually more expensive 
than a Wilton rug. 

True False 2. The use of soda in laundry work lessens the 
expense for soap. 

True False 3. An enameled top is ordinarily better for a 
kitchen table than a wood or glass top. 

True False 4. The purpose of using bluing after washing 
clothes is to keep them white. 

True False 5. Striped wall paper is frequently desirable in 
a room with a low ceiling. 

True False 6. If the floor of the kitchen is of wood, it should 
be thoroughly waxed. 



True False 7. Copper screening is not as satisfactory as gal- 
vanized iron screening. 

True False 8. Porcelain fixtures should be cleaned with 
gasoline. 

True False 9. A composition floor should be kept clean with 
an oiled mop. 

True False 10. A bedroom should have soft, neutral colors on 
its walls. 


Directions : Below are a number of statements dealing with 
facts you have learned in general science. About half of them 
are true and the other half are not true. You are to read each 
statement and place a plus mark (+) on the short line in
front of it if it is true, and a minus sign (−) on the line if it
is not true. To show you just how this is to be done, the first 
two sentences have been marked. The first one is true so it 
has a plus sign in front of it. The second is not true so it 
has a minus sign in front of it. Do not guess, but if you do 
not think you know whether a statement is true or false, omit 
it and go on to the next one. 


_+_  1. Steam may be defined as water vapor at the boiling
        point.
_−_  2. A prism separates a shaft of white light into eight
        distinct colors.
___  3. The cost of producing a given amount of light with
        a tungsten electric bulb is less than with a gas
        mantle.
___  4. Most of the evergreens have soft wood.
___  5. Pasteurization of milk refers to killing the bacteria
        therein by freezing them.
___  6. Steam engines are lighter than gas engines which
        produce the same amount of power.
___  7. Wool is good for winter clothing because it is a
        poor conductor.
___  8. When air is heated it expands.
___  9. The earth is larger than either Mars or Venus.
___ 10. Granite is a relatively durable rock.
___ 11. The speed of sound through air is about 186,000
        miles per second.
___ 12. Water often carries the disease bacteria of typhoid
        fever.

3. Yes-no questions. This variety differs from the 
one just described only in having the exercises placed 
in interrogative rather than declarative form. On the 
whole it seems preferable to the true-false form as 
being probably slightly less confusing and less liable 
to result in leaving erroneous impressions. Especially 
in the elementary grades it is better to employ yes-no 
questions rather than true-false statements when this 
general type of test is to be used. It is not common to 
make use of any other mode of response than the words
“yes” and “no,” though occasionally other methods
are employed. 

Directions: Below are ten questions dealing with things 
you have studied in chemistry. In front of each question you 
see the words “yes” and “no.” Read each question and under-
line the one of those two words which is the correct answer.
Unless you think you know the answer do not underline 
either “yes” or “no,” but skip that question and go on to 
the next one. 

Yes No 1. Is the rusting of iron a chemical process? 

Yes No 2. Will too much oxygen put out a fire? 

Yes No 3. Does the volume of a gas vary directly with the 
pressure, if the temperature is constant ? 

Yes No 4. Does water contain about % oxygen by weight? 
Yes No 5. Is air a chemical compound? 

Yes No 6. Is Cu the symbol for copper? 

Yes No 7. Does valence times combining weight give atomic 
weight ? 

Yes No 8. Is a compound of chlorine and an element called 
a chlorate? 

Yes No 9. Are water and carbon disulfide miscible? 

Yes No 10. Does sulfur have a marked odor? 



The next test given illustrates a possible method of 
testing reading ability. Such a test should deal with 
content well known to the pupils so that it will not 
test range of information but merely ability to read 
and comprehend correctly. 


Directions³: On this page you will see a list of questions
about things with which you are familiar. Read each question
and write the answer, either “yes” or “no,” on the blank line
in front of the question. For example, the first question is
“Does a dog have four legs?” The answer is “yes,” so “yes”
is written on the blank line in front of that question. Now
look at the second question. “Do books like to read?” What
is the answer to this? . . . you are right, “no” is the answer,
so write “no” on the blank line in front of No. 2. Now go
ahead and do the same with the other questions, reading each
and writing “yes” or “no,” whichever is right, in front
of it.


yes  1. Does a dog have four legs?
___  2. Do books like to read?
___  3. Can men run faster than birds can fly?
___  4. Do most trees have green leaves?
___  5. Can you see through a brick wall?
___  6. Is your foot as long as your arm?
___  7. Do you start to school before breakfast?
___  8. Are stoves usually made of iron?
___  9. Is your school built of sand?
___ 10. Do you drink water out of a glass?
___ 11. Do most people go to bed at night?
___ 12. Can you see the sun at night?


3 These directions should ordinarily be read by the teacher but not given at
the top of the test sheet.

4. Other varieties of tests having only two possible
answers. In addition to true-false and yes-no questions
there are many other kinds of exercises which offer
two and only two responses. Three tests will be given
to illustrate as many possibilities of this sort. The first
is in one of the typical multiple-answer forms and, in- 
deed, may be thought of as a regular multiple-answer 
test with only two possibilities. As given here, the two 
suggested answers are arranged one above the other, 
but in many cases it might be better to have them in 
the more usual position of one following the other. 
Sometimes the directions provide for crossing out the 
incorrect one of the two responses, on other occasions 
for indicating the correct one. The second example 
presents a correct and an incorrect spelling of each of 
a number of words. A variation of this would be to 
give merely one form of each word using correct spell- 
ings in about half of the cases and incorrect ones in 
the other half. The third test consists of a number 
of statements which are to be marked to show whether 
or not they deal with the content of a certain sub-
ject, in this case Roman history.

Directions: At one place in each sentence below you will 
see two words, one of which is somewhat above the line of 
the rest of the sentence and the other somewhat below. You 
are to read each sentence and decide which one of these two 
words makes it a correct sentence, that is, which one is right 
to use in the sentence. After doing this, cross out the wrong 
word by drawing a line through it. Look at the first sentence 
“If a person lives here, he or they will hear many noises.”
Which is right, “he” or “they”? “He” is right, so “they”
should be crossed out as is shown. Now read the second sen- 
tence and cross out the one of the two words which is wrong, 
then the third, and so on until you have done all of them. 

 1. If a person lives here, (he / they) will hear many noises.
 2. I am older than (he / ...).
 3. The ice and snow (have / ...) melted.
 4. Let it (... / ...) there.
 5. I (... / ...) a robin.
 6. He (... / ...) the ball.
 7. Will you (... / ...) us go?
 8. (... / ...) I have the ball?
 9. The men have (... / ...) home.
10. John has (... / ...) home.


Directions: Each of the words on this page has been spelled
in two ways. One of these is right and the other wrong. Look 
at the two ways of spelling each word and draw a line through 
the wrong one. 


 1. again     agana           6. ...       know
 2. amoung    among           7. ment      meant
 3. buisy     busy            8. minute    minit
 4. choose    ...             9. peice     piece
 5. doctor    doeter         10. ...       ready


Directions: On this page are a number of statements deal- 
ing with facts and events of ancient history. Some of these 
facts and events were part of Roman history or had a close 
connection with Rome. Others were not in any way directly or 
closely connected with Rome. You are to place an “R” on
the blank line in front of each statement which has to do
directly with Roman history, and an “N” in front of each
one which does not have to do with Roman history. 

___  1. Cæsar defeated Pompey at Pharsalia.
___  2. Hannibal won the battle of Cannae.
___  3. Aristides, called “The Just,” was exiled by
        popular vote.
___  4. Darius and Mardonius were defeated in 490 B.C.
___  5. Numa Pompilius was reputed to have been a great
        law giver.
___  6. Croesus, king of Lydia, was defeated and lost his
        kingdom.
___  7. The Persians were defeated at the battle of Issus.
___  8. The expedition sent against Syracuse almost totally
        perished.
___  9. Pyrrhus, king of Epirus, was called in to aid the
        Greek cities.
___ 10. Mithridates, king of Pontus, attempted to form an
        alliance of many nations and tribes.

5. Alternative tests which provide a third possible an- 
swer. The thought probably comes to the minds of most 
readers that it is impossible to have an alternative test 
with more than two possible responses, and perhaps 
this point of view is justified. It seems, however, that 
exercises such as are contained in the next three tests 
have much more in common with ordinary alternative 
exercises than those of the multiple-answer or any 
other type. The only way in which the first example 
differs from ordinary alternative tests is that it calls 
for pupils to indicate those exercises to which they do 
not know the answers. In other words, they are to make 
some response to every exercise, in this case indicat- 
ing whether the statement is true, false, or that they 
do not know. The second example presents a list of 
statements of three sorts. Some are true, others are 
false, and still others are sometimes true and some- 
times false. The third illustration is still slightly dif-
ferent. It presents a number of pairs of words. In some
cases the two in each pair are synonymous, in others 
they mean just the opposite, and in still others they are 
neither the same nor the opposite. It will be noticed 
that this, as well as the first of the three tests, is in a 
foreign language. Both varieties offer ready and use-
ful means of testing certain achievements in such lan-
guages, the one dealing with comprehension and the 
other with vocabulary. 

A number of advantages have been claimed for tests 
which present third possibilities of the sorts illus- 
trated. It is easily seen that the possibility of a cor- 
rect guess is reduced from one chance out of two to 
one out of three. A somewhat wider range of knowl- 
edge is probably covered and more thought and rea- 
soning required than by the ordinary alternative test. 

Directions: Look at each of the Latin sentences below and 
think what it means, that is, translate it to yourself. Some of 
the sentences are true and some are not true. Place a plus 
mark (+) on the blank line in front of each sentence that
is true, and a minus sign (−) before each one that is not true.
If you do not know whether a sentence is true or false, place 
an in front of it. Mark every sentence as you go along 
in one of these three ways. 

___  1. Quattuor et sex sunt decem.
___  2. Vir fortis e proelio exibit.
___  3. Multi pueri et puellae canes amant.
___  4. Britanni ad Italiam exercitum magnum miserunt.
___  5. Omnes homines olim erant iuvenes.
___  6. Gallinae duo pedes sunt.
___  7. Grata est amicos videre et cum eis loqui.
___  8. Feminae Romanae erant suarum domuum dominae.
___  9. Boves celerius quam equi currere possunt.
___ 10. Poeni tribus bellis Romanos superaverunt.

Directions: Each statement below deals with something you
have studied in civics. Some of the statements are always true,
some are always false, and some are true under some condi-
tions but false under others. Place a plus sign (+) on the
blank line in front of each statement that is always true, a
minus sign (−) in front of each one that is always false, and
an in front of each one that is sometimes true and
sometimes false.



1. A child born in this country becomes a citizen. 

2. The referendum is a means of removing undesir-
able officials.

3. Candidates are now nominated by direct primaries 

instead of conventions. 

4. Any voter may challenge the right of any person 

to vote. 

5. Postmasters are selected through Civil Service. 

6. A state assembly contains two houses. 

7. All bills for raising United States revenue must 

originate in the Senate. 

8. An amendment is not in force until ratified by 

three-fourths of the States. 

9. A new Congress meets every year. 

10. Supreme Court justices are appointed by the Presi- 
dent. 

Directions: Below are a number of pairs of French words.
In some cases the two words in a pair mean the same. In
some cases they mean just the opposite. In some cases they do
not mean either the same or the opposite. If they mean the
same, draw a ring around the “S” in front of them; if they
mean the opposite, draw a ring around the “O”; if they
mean neither the same nor the opposite, draw a ring around
the “N.”


S O N   1. achever    vendre          S O N    6. dormir   se lever
S O N   2. aide       secours         S O N    7. matin    soir
S O N   3. an         année           S O N    8. monter   descendre
S O N   4. boire      briser          S O N    9. noir     blanc
S O N   5. craindre   avoir peur      S O N   10. orage    songe


6. Summary. Though alternative tests may be re-
garded as merely a variety of multiple-answer tests,
it seems desirable for several reasons to discuss them 
separately. On the whole they have received more 
emphasis and use than they merit, though they do have 
some worth. Among the possible undesirable outcomes 
of employing alternative tests are confusion of ideas
and the encouragement of guessing. The exercises in
such tests should be worded with great care and should 
ordinarily be about half affirmative and half negative. 
The question of scoring has received much attention 
and no unanimous agreement has been reached, but 
it appears from the evidence available that the use 
of the right-minus-wrong method is best. It is fre- 
quently somewhat difficult to convince pupils of the 
justice of subtracting the number wrong, but neverthe- 
less it is probably possible to do so in most cases. 
The working out of this method of scoring is illus- 
trated by several individual cases. In addition to the 
common true-false and yes-no types of exercises, al- 
ternative tests include those in regular multiple- 
answer form with only two responses suggested and 
those which have a third possibility of a more or less 
neutral sort, in addition to two which are definitely 
opposed to each other. 



CHAPTER XIII 
COMPLETION TESTS 

I. General discussion. The completion test is another 
of the familiar and frequently used types of the new 
examination. It consists of statements, sometimes 
single sentences and sometimes longer passages, with 
one or more of the words left out. These words are 
to be supplied by pupils either without other help than 
the given context or with such help as will be illus- 
trated by some of the examples in this chapter. In 
many ways completion tests resemble single-answer 
tests and are sometimes thought of as being merely a 
particular form thereof. It is almost always relatively
easy to change an exercise from one form to the other. 
They also overlap with some varieties of multiple- 
answer tests so that some of those already described 
under that heading might almost as well have been 
dealt with in this chapter instead. 

It is ordinarily somewhat easier to cover a whole 
thought or idea with the completion type of exercise 
than with the single-answer, multiple-answer, or any 
other type. That is to say, they can be so formulated 
that all of the important words used have a direct 
bearing on the correct answer. If the statements used 
are carefully prepared they appear to measure thought 
and reasoning ability to a somewhat greater extent 
than do most new-type tests. Furthermore they permit 
a considerable amount of initiative and freedom of
expression. Pupils have relatively little chance of
guessing answers unless they know so much about the 
point covered as to amount to fairly satisfactory 
knowledge thereof. 

On the other hand, completion tests possess several 
decided disadvantages. It is usually found that it re- 
quires more time and thought to prepare them so that 
they can be scored with a high degree of objectivity,
that is, so that there will be no doubt concerning the 
correctness of all responses given by pupils. It is very 
difficult for a teacher to foresee all the possible impli- 
cations of the given portion of an incomplete state- 
ment, and it very commonly occurs that pupils give 
responses which are so near the borderline that it is 
difficult to decide how they shall be scored. However, 
if the same exercises are used from time to time, those 
which are found to be lacking in objectivity of scoring 
can either be modified so as to improve this quality or 
altogether eliminated. Chiefly because of this difficulty 
in securing high objectivity, more time is ordinarily 
required to score the same number of completion ex- 
ercises than is needed for multiple-answer, alternative, 
matching, or several other types. 

An argument sometimes urged against the use of 
completion tests in the school subjects is that they tend 
to be measures of general intelligence rather than of 
achievement in subject-matter. This argument un- 
doubtedly has some validity, but, on the other hand, 
it does not appear to be a sufficient objection to warrant 
discarding this type from the field of achievement test- 
ing. Its effect should rather be to warn the maker of 
completion tests that he must put forth unusual en- 
deavor to construct them so that the expressions to be 
supplied are evidently the direct result of instruction 
in the subject dealt with rather than of general intelli- 
gence or some other ability. 

The rate at which pupils respond to completion ex- 
ercises is somewhat slower than that for most of the 
other types of the new examination. If the exercises 
are single, relatively short sentences with one blank in 
each, two or three per minute is about what may be 
expected. Thus it requires more time to cover the same 
number of items than is true with other forms. Scores 
ordinarily run somewhat higher than on single-answer 
tests, seemingly because the context which is supplied 
is more stimulating to thought. 
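
The rate just mentioned allows a rough estimate of the time a test will require. The short Python sketch below is only an illustration: the function name and the forty-blank test are assumptions made for the example, while the rates of two and three responses per minute are the figures given above.

```python
def minutes_required(number_of_blanks, responses_per_minute):
    """Estimate working time in minutes at a steady rate of response."""
    return number_of_blanks / responses_per_minute


# A hypothetical test of forty single-blank sentences.
slowest = minutes_required(40, 2)   # at two responses per minute
fastest = minutes_required(40, 3)   # at three responses per minute
print(f"Allow roughly {fastest:.0f} to {slowest:.0f} minutes.")
```

An estimate of this sort is only a starting point, since the rate will vary with the age of the pupils and the difficulty of the exercises.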

The most satisfactory method of constructing such 
exercises is to write out in positive form definitions, 
principles, laws, statements of fact, and so forth, and 
then to determine the words which are to be omitted. 
With younger children it is rarely desirable to omit 
more than one word in a sentence or perhaps some- 
times one word in a clause, since so doing is likely to 
make the exercises too difficult. The remark may also 
be made that it becomes somewhat harder to construct 
exercises which are satisfactory from the standpoint 
of scoring as the number of blanks in each sentence is 
increased. For high-school pupils two or three words 
may be omitted within a sentence or sometimes even 
within a clause, though even for them it is probably 
best not to do so very often. When two or more are 
omitted they should ordinarily not be immediately to- 
gether. The words chosen for omission should be rela- 
tively important. For example, a definition may be 
given with the word defined omitted, a statement of 
an act performed by some historical character with 
his name left out, and so forth. 

It is rarely desirable to copy sentences exactly from 
a textbook or other source studied by the members of 
the class. The chief reason therefor is that pupils may 
make correct responses to such situations because of 
mere mechanical memory when they really have no 
understanding of the content dealt with. Of course, 
for pure memory work material may be exactly copied, 
but it is ordinarily better to use other forms of the 
new examination when the purpose in view is to 
test mere memorized facts. 

The instructions for completion tests should em- 
phasize the fact that pupils are to endeavor to find the 
best possible word for each blank. The directions 
should also make very plain the fact that one 
and only one word is to be written upon each blank. 
Therefore, if, as is rarely desirable, two words 
are to be supplied immediately together, two separate 
blanks should be provided for them instead of one 
long blank. 

The most common method of scoring responses to 
completion tests is simply to give one point for each 
correct response. Sometimes, however, two points are 
given for each entirely correct response, one for each 
response partially correct, and none, of course, for 
a response absolutely wrong. In case the test has been 
carefully constructed so that objectivity of scoring 
is high, the first method is recommended, that is, that 
one point be given for each answer judged correct, and 
nothing for any other responses. However, in the case 
of tests so poorly constructed that the correctness of 
many of the answers is doubtful, it is probably wise to 
use the second method and allow half as much credit 
for doubtful answers as for entirely correct ones. Ac- 
cording to any method of scoring it is the usual and 
best practice to allow the given number of points, 
whether it be one, two, or some other number, for each 
blank, and not for each statement or exercise. That 
is to say, each blank in a sentence containing two or 
more blanks would count just as much as the blank 
in a sentence which contained only one. 
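
The two methods of scoring just described may be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the three labels used for the scorer's judgment, and the sample responses are assumptions made for the example and are not part of any standard procedure.

```python
# Illustrative sketch only: score a completion test blank by blank.
# The labels "right", "partial", and "wrong" stand for the scorer's
# judgment of each blank; they are assumed names, not a standard.

def score_blanks(judgments, allow_half_credit=False):
    """Return the total score for a list of per-blank judgments.

    First method (default): one point for each blank judged right,
    nothing for any other response.
    Second method: two points for right, one for partial, none for wrong.
    """
    if allow_half_credit:
        points = {"right": 2, "partial": 1, "wrong": 0}
    else:
        points = {"right": 1, "partial": 0, "wrong": 0}
    return sum(points[judgment] for judgment in judgments)


# A sentence with two blanks followed by a sentence with one blank:
# credit is counted for each blank, not for each sentence.
judgments = ["right", "partial", "right"]
first_method = score_blanks(judgments)
second_method = score_blanks(judgments, allow_half_credit=True)
```

Under the first method the doubtful response earns nothing; under the second it earns half as much as an entirely correct one.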

2. Simple completion tests. The simple or most com- 
mon form of the completion test is that in which there 
is no help of any sort other than what may be derived 
from the context given. This variety, therefore, re- 
quires more initiative and thought on the part of 
pupils than those in which answers are selected from 
a suggested list or help of some other sort given. It is, 
however, somewhat difficult to score with high objec- 
tivity. This type most commonly consists of a series 
of sentences in each of which one or sometimes more 
words have been omitted. In many instances, however, 
a connected paragraph or even a longer selection can 
be used with profit. Simple completion tests will be 
illustrated by four examples, two of which are com- 
posed of separate sentences, the third of a connected 
paragraph, and the fourth of words with one letter 
omitted in each. 

Directions: Below are a number of sentences in each of 
which one or two important words have been omitted, and 
blanks inserted where the words should be. Read each sen- 
tence and write on each blank the word which you think 
makes the best and truest sentence. Do not in any case write 
more than one word on one blank. To make sure you under- 
stand what to do, the right word has been written in on the 
blank line in the first sentence. 

1. The type of insurance policy from which one receives the 
   face value in cash at its maturity is called endowment. 
2. The ordinary abbreviation or symbol for this month or the 
   present month is ________. 
3. The standard weight of a bushel of shelled corn 
   is ________ pounds in most states. 
4. If a person sends goods to another to sell for him, he is 
   called the ________, and the seller the ________. 
5. Income tax is paid upon one’s ________ income, which 
   is his income minus his exemption. 
6. The process of collecting the cash on a note before it is 
   due is called ________ the note. 
7. Before a check can be cashed it is necessary for the payee 
   to ________ it. 
8. The French system of measures, which applies to weight, 
   length, capacity, etc., is called the ________ system. 
9. The rate of interest to be paid on a loan upon which the 
   rate is not specified is called the ________ rate. 
10. The date on which a note falls due is called the date 
    of ________. 
11. A stock which sells above its face value is said to be 
    above ________. 

As has been suggested in connection with other types 
of the new examination, scoring is rendered easier if 
the responses are recorded in a straight column. Pro- 
vision for this is made in the next illustration, which 
is in the field of trigonometry. Short blanks are placed 
in the sentences where words have been omitted and 
longer blanks in front of the sentences for pu- 
pils’ actual responses. This type of arrangement is 
probably slightly more confusing to pupils, but the dif- 
ference is so little as not to be worth serious considera- 
tion. 

Directions: Below are twenty trigonometry statements, 
definitions, and formulae. In each one a word has been omitted. 
The place where this belongs is indicated by a short line. You 
are to write the word which needs to be inserted to make the 
best or truest statement on the longer blank line in front of 
each statement. Do not write more than one word on each 
line. 

1. An angle of less than 90° is called ________. 
3. Logarithms with respect to base 10 are called 
   ________ logarithms. 
4. Numerical values of trigonometric functions are 
   called ________ functions. 
6. The logarithm of a product is equal to the ________ 
   of the logarithm of its factors. 
7. ________ $A = 1 + \tan^2 A$. 
8. The fractional part of a logarithm is the ________. 
9. The logarithm of $1/M$ is the ________ of $M$. 
10. The ________ of an angle of 30° is .5. 
11. $\sin^2 A + \cos^2 A =$ ________. 
12. ________ $= \sqrt{s(s-a)(s-b)(s-c)}$. 
13. The instrument used by surveyors to measure 
    angles is a ________. 
14. The ________ of A is the ratio of the opposite side 
    to the adjacent side. 


15. 


sin A 


= A. 


16. The logarithm of one is always ________. 
17. The cosine of any angle cannot be greater 
    than ________. 


18. If A is obtuse, sin A= 

A). 

19. Two angles whose sum is 90° are called ________. 

20 A= - . 

a 

Directions : The following paragraph is a brief general de- 
scription of certain features of our government with a number 
of the important words left out. A blank line is placed where 
each omitted word should be. Read the paragraph and fill 
in the proper word on each blank line. Do not in any case write 
more than one word on a single blank. 

There are three branches of our federal government, the 
________, the ________, and the ________. At the 
head of one branch is the President, at the head of the second, 
Congress, and at the head of the third the ________. 
The President is assisted by his cabinet in which 
there are ________ members. Congress consists of ________ 
members, there being ________ senators, ________ 
from each state, and ________ representatives, 
the number from each state depending upon the 
________ of that state. The Supreme Court consists of one 
________ justice and ________ associate justices. Its 
members and also the cabinet officers are appointed by the 
________. The term of office of the President is 
________ years, that of a senator ________ and that 
of a representative ________ years. Supreme Court justices 
ordinarily serve until ________, whereas cabinet officers 
usually offer their resignations at the end of a 
________ term. 

The next example illustrates an unusual applica- 
tion of the completion test. It consists of a number of 
single words which frequently give difficulty in spell- 
ing. One letter of each word is omitted. Pupils are to 
supply the missing letter of each. In constructing such 
exercises teachers should be careful that the words in- 
cluded are easily recognizable and that there is only 
one possible way of filling in each blank to form a real 
word. For example, it would be undesirable to present 
such a form as l_tter, which might be filled in so as to 
make either latter, letter, or litter. 
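
The check recommended here, that only one letter will complete each form, can be sketched in Python. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the small vocabulary is invented for the illustration, whereas a teacher would check each form against a full word list or a dictionary.

```python
import string


def possible_fills(pattern, vocabulary):
    """Return every letter that turns `pattern` into a word in `vocabulary`.

    `pattern` contains a single underscore where one letter is omitted.
    """
    fills = []
    for letter in string.ascii_lowercase:
        candidate = pattern.replace("_", letter, 1)
        if candidate in vocabulary:
            fills.append(letter)
    return fills


# A tiny assumed vocabulary, sufficient for the two forms tried below.
vocabulary = {"latter", "letter", "litter", "necessary", "abyss"}

ambiguous = possible_fills("l_tter", vocabulary)     # three letters fit
unique = possible_fills("ne_essary", vocabulary)     # only one letter fits
```

A form for which more than one letter is returned should be discarded or replaced.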


Directions: Below are twenty words in each of which one 
letter is left out. You are to write in the correct letter on 
the blank line in each. For example, if you had “promin_nce,” 
you should write “e” in the word on the short 
blank line to make the word “prominence.” Be sure to write 
only one letter in each word and to write the letter which 
makes the spelling of the word correct. 


1. rende_vous 
2. discern_ble 
3. rhino_eros 
4. cemet_ry 
5. endeav_r 
6. millin_ry 
7. priv_lege 
8. ne_essary 
9. caul_flower 
10. pne_monia 
11. combust_ble 
12. dis_ension 
13. inflamm_ble 
14. s_ndicate 
15. ab_ss 
16. sover_ign 
17. superintend_nt 
18. mu_ilage 
19. counterf_it 
20. per_eived 


3. Completion tests with suggested answers. The 
most common, probably the only satisfactory, means 
of providing pupils with more help than the mere con- 
text is to accompany the completion exercises with a 
list of words from which all answers are to be selected. 
Sometimes this list consists only of the answers actu- 
ally to be used, arranged, of course, in a different order 
from that in which they are called for by the blanks. 
Such a list of answers is open to the objection that, if 
a pupil knows the responses to most of the exercises, 
he can by a process of elimination, with perhaps a 
little guessing, answer the others. It is, therefore, bet- 
ter to include in the list of suggested answers a num- 
ber of additional ones which are not correct for any 
of the blanks. Sometimes the exercises are so formu- 
lated that no word in the suggested list should be 
employed more than once, on other occasions they are 
such that some or all of the words therein may be used 
two or more times each. 

As was suggested in the preceding section, objec- 
tivity of scoring is almost always increased by giving 
a list of words from which all responses must be taken. 
With a reasonable amount of care such a list can be 
so constructed that it contains no words of doubtful 
correctness as answers for the given exercises, that is 
to say, each word therein is either unquestionably 
right or wrong for every blank. Because of this feature 
it is recommended by the writer that this variety of 
completion test be most frequently used and that the 
list of words contain a number of wrong answers as 
well as all the right ones. 
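
The preparation of such a list may be sketched as follows in Python. This is an illustration under stated assumptions: the function name is hypothetical, the sample words are borrowed from the agriculture test given later in this section, and alphabetical order is used simply as one convenient arrangement that differs from the order of the blanks.

```python
def build_word_list(correct_answers, wrong_answers):
    """Combine right and wrong answers into one alphabetical list.

    A word must not appear in both groups; otherwise it would be of
    doubtful correctness, which is the fault the list is meant to avoid.
    """
    overlap = set(correct_answers) & set(wrong_answers)
    if overlap:
        raise ValueError(f"words listed as both right and wrong: {sorted(overlap)}")
    return sorted(set(correct_answers) | set(wrong_answers))


# Right answers in the order the blanks call for them, plus extra words.
answers = ["mash", "culling", "scrub", "humus"]
extras = ["loam", "bran", "purebred"]
word_list = build_word_list(answers, extras)
```

Because the extra words are mixed in alphabetically, a pupil cannot supply the last few answers merely by eliminating the words already used.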

The first example of this variety is a paragraph 
completion test in the field of economics. It contains 
fifteen blanks and a list of fifteen correct words with 
which these blanks should be filled. Following these is 
a sentence test in agriculture which is accompanied by 
almost twice as many suggested answers as there are 
blanks to be filled. The third test illustrates a possible 
use of the completion type in foreign language. As will 
be seen, it presents a paragraph of French with a 
number of words omitted and gives a list of possible 
answers containing these and also a number of other 
words. 

Directions: You will find below a paragraph with fifteen 
words omitted and a blank line where each belongs. At the 
right of this paragraph is a list of the fifteen words arranged 
alphabetically. Read the paragraph and write the proper one 
of these words on each blank. Do not write more than one 
word on each blank, and do not use any word more than 
once. It will be helpful for you to check each word as you 
use it to avoid the danger of using it again. 


Adam Smith named three advantages resulting from division 
of labor: the improvement in the ________ of the workman, 
the saving of ________ lost in going from one sort of work 
to another and the application of proper ________, which 
reduces the work of the worker to a ________. Two kinds of 
division of labor may be distinguished, ________ and 
________. The former involves doing different kinds of work 
in ________ places at the ________ time, the latter involves 
doing different parts of a complete job in ________. The 
former becomes, in one of its broader aspects, ________ 
division of labor and gives rise to ________. Still further 
we have ________ division of labor resulting when the 
________ are different ________ rather than parts of the 
same ________. 

contemporaneous 
countries 
country 
dexterity 
different 
international 
machinery 
minimum 
same 
succession 
successive 
territorial 
territories 
time 
transportation 


Directions : Each of the ten sentences given below contains 
one or two blanks representing the omission of words. From 
the list of words given at the right select the one which be- 
longs on each blank in order to make the best or truest sen- 
tence and write it upon the blank. The list contains all of 
the correct answers and also a number of words which are 
not to be used. Do not use any word in the list more than 
once. It will help you to do this if you will check each word 
as you use it. 


1. Ponderosa is a variety of ________. 
2. No feed should be given young chicks until they are 
   ________ days old. 
3. Finely ground mixed food fed to chickens either wet 
   or dry is called ________. 
4. Separating laying hens from those that do not lay is 
   called ________. 
5. An animal that bears no evidence of good breeding is 
   called a ________. 
6. Partly decayed vegetable matter found in the soil is 
   called ________. 
7. The purpose of a dust mulch is to check the ________ 
   of ________ from the soil. 
8. Clover and other leguminous plants gather ________ 
   from the air. 
9. In testing seed corn ________ kernels are usually 
   selected from each ear. 
10. A potato weighing about one-half pound should be cut 
    into ________ pieces for planting. 

bacteria 
bran 
corn 
culling 
dubbing 
eight 
evaporation 
four 
humus 
lettuce 
loam 
mash 
mineral 
moisture 
nitrogen 
purebred 
scrub 
six 
tomato 
two 


Directions : This test consists of a French paragraph from 
which a number of words have been omitted. Each blank 
line shows where such an omission has occurred. You are to 
choose the proper one of the words in the list at the right 
and write it on each blank. This list contains all of the cor- 
rect words and also some extra ones which you are not to use. 


Nous ________ à la campagne, où nous ________ un jardin. 
D’________ côté il y a ________ fleurs, de l’autre côté il y a 
beaucoup ________ légumes. Parmi ________ il y a des pommes 
de terre, des choux ________ des carottes. On a planté 
________ pommes de terre ________ mois d’avril et ________ 
mûrissent ________ trois ou quatre mois. 

a 
au 
avez 
avons 
dans 
de 
des 
elles 
et 
eux 
habitons 
ils 
les 
que 
un 
une 
vivons 




The next examples illustrate the type of test in which 
some or all of the words in the suggested list are to be 
used two or more times. Indeed, in the second example, 
which contains about twenty blanks and only three 
suggested words, each will be used considerably more 
than two times. In these two examples the given lists 
of words do not contain any which are not to be used 
but it is easily possible to insert some even though 
others are to be used several times each. 


Directions: Below are a number of statements concerning 
characters in ancient history with the names of the characters 
omitted. At the right is a list of eight characters. You are 
to write the name of the proper one of these eight characters 
on each blank line. You will need to use some of the names 
more than once. 


1. ________ was one of the most famous kings of the 
   Spartans. 
2. The king of Persia who invaded Greece and burned 
   Athens was ________. 
3. The two leaders under whom Thebes rose to prominence 
   were ________ and ________. 
4. The greatest statesman of Athens and the man who was 
   largely responsible for its high development was 
   ________. 
5. The commander-in-chief of the Greeks at Thermopylae 
   was ________. 
6. ________ was chiefly responsible for the Athenian 
   policy of retiring within the walls when the Spartans 
   invaded Attica every season. 
7. The last king of Persia before it was conquered by the 
   Macedonians was ________. 
8. The Spartan general who went to Syracuse and was 
   chiefly responsible for defeating the Athenian 
   expedition was ________. 
9. The king of Persia who sent an expedition which was 
   defeated at Marathon was ________ and the Persian 
   general was ________. 
10. The leader of the Theban Sacred Band when it expelled 
    the Spartans was ________. 

Darius 
Epaminondas 
Gylippus 
Leonidas 
Mardonius 
Pelopidas 
Pericles 
Xerxes 


Directions: The paragraph given below contains about 
twenty blank lines where words have been omitted. In every 
case the word omitted is one of the three words “to,” “too,” 
and “two.” Read the paragraph and write the correct one 
of these three words in each blank. If you do not know which 
one belongs in a blank after thinking about it a little while, 
go on to the next blank. 

In a certain family there were ______ boys and ______ 
girls. The boys, John and Tom, liked ______ tease the girls, 
which often caused them ______ cry. Their father had told 
the boys not ______ do it any more ______ different times 
and was going ______ punish them if he caught them doing it 
again. Their mother, ______, had told the boys ______ stop 
being ______ rough with their sisters. One day, when it was 
snowing and was very cold ______, the boys caught their 
sisters on the way ______ school and stuffed snow down 
their backs. When the girls came home they told their 
mother, who said she would punish the boys and tell their 
father ______ do so ______. So that evening the father took 
the boys out ______ the woodshed, where he gave them 
switchings, and their mother sent them ______ bed early and 
without their suppers. 

The next example illustrates a form sometimes used 
in order to save the time of the pupils. In it the sug- 
gested answers are numbered and instead of writing 
in the correct answers all that the pupils do is to copy 
the numbers in front of them. The saving in time is, 
however, so small that it cannot be said that this 
method of recording responses possesses great advan- 
tage over the ordinary method of copying the entire 
word. 


Directions: This test consists of ten statements dealing 
with facts you have studied in this course. One or more words 
have been omitted from each statement and their places taken 
by blanks. The words which have been omitted and also a 
few others are in the list at the right. Read each sentence and 
then select from this list the word which belongs on each 
blank. Write its number upon the blank. For example, if you 
find that the first word, ammeter, is the right word for a 
certain blank, place a figure “1” on that blank. 


1. A ________ is slightly more than a quart. 
2. The ________ of lead is 11.3. 
3. The indefinite expansibility of a ________ is probably 
   due to ________ motion. 
4. The fact of capillary attraction may be explained by 
   the ________ theory. 
5. The force of ________ decreases as the square of the 
   distance increases. 
6. If the ________ is decreased the boiling point is 
   lowered. 
7. The efficiency of electric lamps is measured by 
   ________ per candle power. 
8. The unit of rate of electric current is the ________. 
9. The usual device for measuring strength of current is 
   an ________. 
10. ________ is a better electric conductor than steel. 

1. ammeter 
2. ampere 
3. atomic 
4. copper 
5. density 
6. gas 
7. gravity 
8. liter 
9. meter 
10. molecular 
11. ohm 
12. pressure 
13. rotation 
14. solid 
15. velocity 
16. watt 
17. weight 
18. zinc 


4. Other varieties of completion tests. In addition to 
providing a list of words from which to select the 
responses, there are two other methods which have 
sometimes been used to give pupils some assistance 
and also to render scoring more objective. One of these, 
which is illustrated by the first example, consists in 
supplying at the beginning of each blank the initial 
letter of the word which should be written upon that 
blank. The second provides in connection with each 
blank a number which indicates how many letters there 
are in the proper word. Although, as was stated, these 
methods have occasionally been employed, they appear 
to have little to recommend them in preference to a 
list of possible answers. They are illustrated here 
rather for the sake of completeness than with any de- 
sire to suggest their use. 

Directions: Each blank in the sentences below is to be filled 
in with a word of which the first letter is given at the begin- 
ning of the blank. Read each sentence, try to think of the 
correct word for each blank beginning with the given letter, 
and when you are able to think of such a word, write it on the 
blank. 

1. A number represented by a letter is called a l________ 
   number. 
2. An expression whose parts are not separated by plus or 
   minus signs is a m________. 
3. A number written at the upper right of a quantity to 
   show how many times it is used as a factor is an 
   e________. 
4. An equation whose members are equal for all values of 
   the unknowns involved is called an i________. 
5. A number into which another may be divided with no 
   remainder is called its m________. 
6. A fraction with one or more fractions in either or both 
   of its terms is a c________ fraction. 
7. In graphing, the horizontal base line is called the 
   X-a________. 
8. The indicated square root of a number not a perfect 
   square is known as a q________ s________. 
9. The indicated square root of a negative number is called 
   an i________ number. 
10. When the product of corresponding values of two 
    variables is constant they are said to vary i________ 
    to each other. 

Directions: Each of the blanks in the statements on this 
page should be filled in with a word which has the number of 
letters indicated by the number just below the blank. Read 
each sentence and try to find a word for each blank which 
makes the sentence true and has the right number of letters. 
Do not place more than one word in each blank and do not 
write any word unless it has the correct number of letters. 

1. One can make purple by mixing red and ________. 
                                            4 
2. A line from the center of vision to the ________ 
                                              8 
   point is called the line of vision. 
3. ________ is the complementary color of red. 
      5 
4. A color produced by mixing equal quantities of two 
   colors is called a ________ color. 
                         9 
5. The colors of nature need not be used for a ________ 
                                                  10 
   landscape. 
6. Red, yellow and blue are usually considered the 
   ________ and ________ colors. 
      11           7 
7. Too many pictures hung in a room are liable to destroy 
   the ________ elements of the room. 
          10 
8. Posters are usually done in a flat ________ manner 
                                         10 
   rather than through the use of ________. 
                                     11 
9. The fundamental principles of design are usually 
   considered to be ________, ________, and ________. 
                       6         7             7 
10. Barns and other outbuildings should be painted so as 
    to ________ with the ________. 
          9                 5 

5. Summary. The completion test is another of 
the commonly used types of the new examination. It 
probably tests knowledge of a complete thought better 
than most of the other kinds of objective tests. Al- 
though it is somewhat difficult to construct so that scor- 
ing is highly objective, and has certain other dis- 
advantages, yet it merits rather wide use. Complete 
statements should be formulated and one or more im- 
portant words omitted from each. The statements may 
be single sentences, connected paragraphs or longer 
passages. Pupils should be instructed to find the best 
possible word for each blank. In scoring responses the 
ordinary method is to count them as either right or 
wrong, though sometimes half credit is allowed for 
those which are partially correct. Scoring can be made 
more objective by providing a list of answers from 
which all responses are to be taken rather than leav- 
ing their selection entirely to pupil initiative and, 
therefore, this practice is recommended as most desir- 
able. Sometimes the initial letter of, or the number of 
letters in, each word is supplied, but this is not highly 
recommended. 



CHAPTER XIV 
MATCHING TESTS 

1. General discussion. The essential characteristic of 
a matching test is that it presents two sets, or some- 
times more, of expressions or lists of items of some 
sort and asks the pupils to match some or all of those 
in one set with some or all of those in the other. One 
list may be composed of a number of dates and the 
other of a number of events, one of Latin words and 
the other of their English equivalents, one of cities 
and the other of the states in which they are located, 
and so on almost without end. There is almost no sub- 
ject and no phase of a subject in which a satisfactory 
matching test cannot be constructed. Although such 
tests are usually employed to test association of pairs 
of simple facts, they can also be made to cover broader 
thought processes. For example, one list of items may 
consist of portions of a number of definitions or de- 
scriptions, and the other of the remaining portions of 
these definitions or descriptions. 

Few arguments have been advanced against the use 
of matching tests. It has been said that guessing may 
play too great a part in the results, but its effect can 
be rendered negligible by including a number of items 
in one list which do not match any of those given in 
the other. 

One of the most important points in connection with 
constructing matching tests is the determination of the 
number of items included. The writer has seen such 
tests in which there were from twenty-five to fifty 
items in each list. When the number is anything like 
this great, pupils usually waste a great deal of time in 
merely looking up and down the list to find the proper 
items. On the other hand, if the lists are extremely 
short, the possibility of guessing referred to above 
cannot be avoided. It is, therefore, recommended that 
lists should not contain less than ten items nor more 
than twenty. 
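
The bearing of list length upon guessing can be illustrated by a short computation in Python. The sketch rests on assumptions made only for the illustration: that the two lists are of equal length, that every item is used exactly once, and that a pupil who knows nothing pairs the items wholly at random. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def expected_chance_matches(n):
    """Average number of correct pairs over all n! random one-to-one matchings."""
    total_correct = 0
    count = 0
    for arrangement in permutations(range(n)):
        # Item i is correctly matched when it is paired with position i.
        total_correct += sum(1 for i, p in enumerate(arrangement) if i == p)
        count += 1
    return Fraction(total_correct, count)


# Share of the whole test that pure guessing yields on the average.
share_short_list = expected_chance_matches(3) / 3
share_longer_list = expected_chance_matches(6) / 6
```

Under these assumptions the average number of pairs matched by chance is one, whatever the length of the lists, so that guessing accounts for a third of a three-item test but only a tenth of a ten-item test.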

The arrangement of items in one of the lists should 
be such that it is random with respect to those in 
the other list. Frequently alphabetical, chronological, 
or some similar order accomplishes this purpose very 
satisfactorily and also renders it easier for pupils to 
find the correct items. For example, if a test is given 
in which one list is composed of names of authors and 
the other of titles of their works, one or both may well 
be arranged alphabetically. Thus when a pupil sees 
the title of a work he will have no difficulty in finding 
the name of the author at once if he knows it. 

A method of increasing the difficulty of matching 
tests and also of covering a wider range of knowledge 
is to have more than two sets or lists of items to be 
matched. As will be shown by some of the examples 
near the end of this chapter, one may without great 
difficulty construct such tests in a number of subjects. 

Another but not so satisfactory method of increas- 
ing the difficulty of matching tests consists in having 
two or more items in one column which may be cor- 
rectly matched with a single item in the other column, 
but such that a particular one of them must be matched 
with this item in order that the others may be matched 
with other items. For example, if pupils were asked to 
match names of generals with the battles in which they 
took part, one list might contain the name of Gates 
and the other Saratoga and Camden, in both of which 
he participated. Arnold, who took part in the former 
but not in the latter of these battles, might also be 
named, thus rendering it necessary to match Camden 
with Gates and leave Saratoga for Arnold. Such a 
complication as this should probably not be introduced 
in matching tests for elementary school pupils, cer- 
tainly not for those in the lower grades, but in high 
school it is occasionally in place. 
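
The reasoning required of the pupil in the example of Gates and Arnold can be traced in a brief Python sketch. It is offered only as an illustration: the function name and the table of participation are assumptions built from the example above, and the method shown, trying every possible pairing, is merely the simplest way to exhibit the point.

```python
from itertools import permutations

# Battles in which each general took part, as stated in the example.
took_part = {
    "Gates": {"Saratoga", "Camden"},
    "Arnold": {"Saratoga"},
}


def consistent_matchings(took_part):
    """List every way of giving each general a different battle he fought in."""
    generals = sorted(took_part)
    battles = sorted(set().union(*took_part.values()))
    found = []
    for order in permutations(battles, len(generals)):
        if all(battle in took_part[general]
               for general, battle in zip(generals, order)):
            found.append(dict(zip(generals, order)))
    return found


matchings = consistent_matchings(took_part)
```

Only one pairing survives, namely Camden with Gates and Saratoga with Arnold, although Saratoga taken by itself might correctly be matched with either general.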

If matching tests are carefully constructed the scores 
yielded will possess practically perfect objectivity. In 
this feature, therefore, they rank as high as any type 
of the new examination. 

The ordinary method of scoring pupils’ responses to 
matching tests is merely to count one point for each 
pair of items correctly matched. It has been suggested 
that some deduction should be made for those incor- 
rectly matched. Although this proposal is more or less 
logical, especially in view of the recommended method 
of scoring alternative tests in which the number of 
wrongs is subtracted from the number of rights, yet 
it has not come into common acceptance and is not 
recommended here. 
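
The ordinary method of scoring, together with the proposed deduction for wrong matches, may be sketched in Python as follows. This is an illustration under stated assumptions: the function name, the use of `None` for an item left unmatched, and the sample key of cities and states are all invented for the example.

```python
def score_matching(responses, key, deduct_wrong=False):
    """Score a matching test.

    Ordinary method: one point for each pair correctly matched.
    With deduct_wrong=True the number of wrong matches is subtracted
    from the number of right ones; omitted items count as neither.
    """
    rights = sum(1 for item, choice in responses.items()
                 if choice is not None and key.get(item) == choice)
    wrongs = sum(1 for item, choice in responses.items()
                 if choice is not None and key.get(item) != choice)
    return rights - wrongs if deduct_wrong else rights


key = {"Springfield": "Illinois", "Albany": "New York", "Austin": "Texas"}
responses = {"Springfield": "Illinois", "Albany": "Texas", "Austin": None}

ordinary_score = score_matching(responses, key)
score_with_deduction = score_matching(responses, key, deduct_wrong=True)
```

The first score follows the ordinary method recommended in the text; the second is shown only to make clear what the rejected proposal would involve.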

The writer’s general recommendation as to the use 
of matching tests is that they be employed with de- 
cided frequency. He would certainly rank them among 
the two or three types to be used most often in most 
elementary and high-school subjects both because of 
their comparative ease of construction and scoring, and 
because of the functions they fulfill. 

2. Ordinary matching tests. Probably the best general
classification of matching tests is into two types
which are here called the ordinary and the compound.
The ordinary or simple type includes merely two lists
of items whereas the compound includes three or more.
In the next few pages seven illustrations of different
varieties and possibilities of the simple type are given.
The first of these, which is in arithmetic, and the sec- 
ond, in biology, present lists of equal length in which 
all of the items are to be used. 

Directions: On the page below you see two lists of numbers.
The first list, each of which has a letter in front of it, is composed
of integers or mixed numbers reduced to lowest terms.
The second list is composed of fractions which are not in their
lowest terms. Each of the numbers in the first list represents or
is the same as one of the fractions in the second list reduced
to its lowest terms. You are to look over the lists, find the number
equal to each fraction, and write the letter which is in
front of that number right after the fraction. For example,
the first number in the first column is [illegible]. It is equal to the
fraction [illegible] reduced to its lowest terms, so an “a” should
be written right after “[illegible].” You see this has been done. Go
ahead and do the others in the same way.


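[Two lists: integers and mixed numbers lettered a through k (among
them c. 2 and j. 8), and unreduced fractions to be matched with
them. The remaining entries are illegible in this copy.]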
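
A teacher preparing the key for an exercise of this kind needs each fraction reduced to an integer or mixed number. The sketch below is an illustration only. The function name `lowest_terms` is hypothetical, and the sample fractions are invented for the demonstration; they are not the entries of the test above, which cannot be read in this copy.

```python
from fractions import Fraction

def lowest_terms(numerator, denominator):
    """Return a positive fraction as an integer or mixed number in lowest terms."""
    value = Fraction(numerator, denominator)  # Fraction reduces automatically
    whole, remainder = divmod(value.numerator, value.denominator)
    if remainder == 0:
        return str(whole)
    part = f"{remainder}/{value.denominator}"
    return part if whole == 0 else f"{whole} {part}"

# Invented sample fractions, for illustration only.
for num, den in [(16, 8), (14, 4), (6, 12)]:
    print(f"{num}/{den} = {lowest_terms(num, den)}")
```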


Directions: Below you see a list of twelve numbers and
twelve letters. After each number is the first half of a statement
having to do with biology and after each letter the second
half of such a statement. If the halves after the numbers
and those after the letters are properly matched up, they will
make twelve true statements. Read the portion of a statement
after each number and find the portion after a letter which
completes it so as to make a true statement. Write the letter
in front of the last half of the statement on the short blank in
front of the first half of the statement. For example, No. 1 is
“Most normally green plants lose their color” and this can be
completed to make a true statement by adding what is found
after “i” in the second column, “when grown in the dark.”
Therefore “i” has been placed on the blank line in front of
No. 1.


_i_  1. Most normally green plants      a. reproduction by budding
        lose their color                   and producing eggs.
___  2. Water vapor escapes from        b. through their stomata.
        plants
___  3. The common characteristic       c. contracts into a rounded
        of flowering plants is             mass.
___  4. Almost all plants which         d. their responses to stimuli.
        form coal
___  5. The most conspicuous animal     e. the casting of the shell.
        characteristic is
___  6. When an expanded amoeba is      f. the ability to move from
        strongly stimulated it             place to place.
___  7. The greatest difference         g. a division into head,
        between the highest and            thorax and abdomen.
        lowest animals is in
___  8. The characteristic of hydra     h. the ability to evert its
        is                                 stomach.
___  9. The starfish possesses          i. when grown in the dark.
___ 10. The general structure of        j. the formation of a
        insects is                         reproductive body.
___ 11. The mammals usually possess     k. are now extinct.
___ 12. Ecdysis is the name given to    l. a covering of hair.





The next illustration, which is in German, is the 
same as the last two except that it illustrates the com- 
plicating feature referred to in the general discussion, 
that is, it contains cases in which either one of two 
items in one column is a correct match for one in the 
other, but, because of other items in the second column, 
only one of the two can be used. 


Directions: On the page below are twelve German words 
and twelve descriptive terms. One of these terms belongs with 
each of the words. Look over the lists and place the number 
of the correct word in front of the descriptive phrase with 
which it belongs. 


masculine noun 
personal pronoun 
infinitive of verb 
possessive pronoun 
neuter noun 
past participle of verb 
feminine adjective 
adverb 

pronoun used as direct object 
past tense of verb 
modal auxiliary verb 
demonstrative pronoun 


1. Wasser

2. gegangen 

3. dies 

4. sie 

5. stand 

6. gut 

7. Lehrer 

8. haben 

9. leichte 

10. kommen 

11. den 

12. sein 



The next three examples illustrate a variety which 
in general is to be preferred to that already shown. 
In each of these there are more items in one column 
than in the other, so that the last few cannot be 
matched by a process of elimination. 


Directions: On the page below you will see the names of ten
books, stories, or poems, and also of fifteen authors. Look
over the list of authors and find the name of the one who
wrote each book, story, or poem. Place the number which is
before the author's name just in front of the name of his
work. Do not use the name of any author twice.





Hamlet
Idylls of the King
“Legend of Sleepy Hollow”
House of Seven Gables
Silas Marner
Ivanhoe
“Il Penseroso”
Childe Harold
Oliver Twist
Canterbury Tales


1. Browning 

2. Byron 

3. Chaucer 

4. Cooper 

5. Dickens 

6. Eliot 

7. Hawthorne 

8. Irving 

9. Kipling 

10. Milton 

11. Poe 

12. Scott 

13. Shakespeare 

14. Shelley 

15. Tennyson 


Directions: Below are two lists of expressions. The first 
is composed of definitions or explanations and the second of 
terms. Read each definition or explanation and then find in
the second list the term which is defined. Place the letter in 
front of this term on the short line in front of the definition 
or explanation. 


1. Absorption of nourishment from another plant
2. Combination of carbon dioxide and water
3. Detection of direction of pull of gravity
4. Diffusion through a membrane
5. Hole in epidermis
6. Leaf-stalk
7. One of the algae
8. One of the composite family
9. One of the fungi
10. One of the gymnosperms
11. Portion of cell
12. Underground stem


a. blade 

b. coconut palm 

c. cytoplasm 

d. dandelion 

e. enzyme 

f. geotropism 

g. mountain laurel 

h. osmosis 

i. petiole 

j. photosynthesis 

k. pleurococcus 

l. rhizome 

m. stoma 

n. symbiosis 

o. transpiration

p. wheat rust 

q. white pine 

r. wintergreen 





Directions: In the right-hand column below are twelve 
algebraic expressions and in the left-hand column sixteen 
such expressions. Some one of the expressions in the right- 
hand column is equivalent to each of the expressions in the 
left-hand column. Find these equivalent expressions and in- 
dicate which is equivalent to each of those in the left-hand 
column by writing its number just in front of the latter. For 
example, the third expression in the right-hand column is 
equivalent to the first one in the left-hand column, there- 
fore “3” is written on the short line in front of the latter. 


_3_ 5x(4x − y)
___ √(16x⁴)
___ [illegible expression involving 64]
___ 2y(4x² + y²)
___ (x + 2y)(x − 2y)
___ [illegible]
___ (x + 2y)²
___ √(36y⁴)
___ (x + 2y)(x − y)
___ (x² + 2xy + y²) / (x² − y²)
___ 3(x² + y²) / [12(x⁴ − y⁴)]

1. 4x[exponent illegible]
2. x² + 4xy + 4y²
3. 20x² − 5xy
4. 4y²
5. 6y²
6. 2x²√5
7. (x + y) / (x − y)
8. 4x²
9. 3 / [4(x² + y²)]
10. x² − 4y²
11. x² + xy − 2y²
12. 3y²
13. x² − xy − 2y²
14. 1 / [4(x² − y²)]
15. 8x²
16. 8x²y + 2y³
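
When a key for such a test is drawn up, each claimed pair of equivalent expressions can be checked by substituting numbers. The sketch below is an illustration and not part of the original text. The function name `equivalent` is hypothetical, and it is limited to polynomial expressions without division. The first pair is the example worked in the directions above. The second pair sets one expression against a distractor, in a reading of the printed lists that is assumed here.

```python
from fractions import Fraction
from itertools import product

def equivalent(f, g, samples=range(-3, 4)):
    """Return True if f(x, y) == g(x, y) at every sample point.

    Agreement on a grid of points is strong evidence of identity
    for low-degree polynomials; it is not offered as a formal proof.
    """
    points = [(Fraction(x), Fraction(y)) for x, y in product(samples, repeat=2)]
    return all(f(x, y) == g(x, y) for x, y in points)

# Pair given as the worked example in the directions: 5x(4x - y) = 20x^2 - 5xy.
print(equivalent(lambda x, y: 5 * x * (4 * x - y),
                 lambda x, y: 20 * x**2 - 5 * x * y))

# (x + 2y)(x - y) against a distractor that differs only in one sign.
print(equivalent(lambda x, y: (x + 2 * y) * (x - y),
                 lambda x, y: x**2 - x * y - 2 * y**2))
```

The first check succeeds and the second fails, which shows how a distractor differing from the true answer by a single sign is rejected.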


The next example varies slightly from the last three
though it is not essentially different. It contains the
same number of items in each column but not all of
the items match.

Directions: You see below one list of twelve chemical ele- 
ments and another of twelve symbols. The symbols which 
stand for most of the elements named, but not for all of 
them, are given. Look at the name of each element and see 
if you can find the correct symbol for it. If you can, copy the
letter in front of this symbol on the short line in front of the 
name of the element. 


 1. Antimony        a. A
 2. Calcium         b. Au
 3. Copper          c. C
 4. Gold            d. Cu
 5. Iodine          e. Fe
 6. Iron            f. Hg
 7. Mercury         g. I
 8. Phosphorus      h. Na
 9. Sodium          i. P
10. Sulfur          j. S
11. Tin             k. Sb
12. Zinc            l. Ti


3. Compound matching tests. The compound match- 
ing test does not differ from the ordinary one except 
in having a third and sometimes even a fourth or 
fifth column. In the three examples given below only 
three columns are used, and it is very rarely desirable 
to increase this number. This type of test should 
probably not be used in primary and intermediate 
grades, but in the upper elementary and high-school 
years it is frequently valuable. 

Directions: Below are three lists. The first consists of the
names of prominent historical characters, the second of important
events, and the third of dates. Look at the name of
each character and find the event in the list with which he
was closely connected or in which he took part, and the
date at which it occurred. Write the letter in front of the
event and also that in front of the date on the blank line in
front of the name of the character. For example, the first
name is “James I.” He was closely connected with the accession
of the Stuarts since he was the first king of this house
to reign over England. This event occurred in 1603, therefore
“E” and “i” are written upon the line in front of
“James I.”


1. James I
2. Peter the Great
3. William of Orange
4. Louis XVI
5. Nelson
6. Louis XIV
7. Kossuth
8. Kaiser William I
9. Grand Duke Nicholas
10. Metternich
11. Frederick the Great
12. Charles II

A. Expulsion of Stuarts
B. War of the Spanish Succession
C. Hungarian Revolution
D. Holy Alliance
E. Accession of Stuarts
F. World War
G. The Restoration
H. War of the Austrian Succession
I. Battle of Trafalgar
J. Franco-Prussian War
K. French Revolution
L. Rise of Russia

a. 1689-1725
b. 1789-95
c. 1805
d. 1870-71
e. 1688
f. 1740-48
g. 1815
h. 1914-17
i. 1603
j. 1848
k. 1660
l. 1701-13


Directions: On the page below you see a list of English
words followed by two lists of Latin words. For most of the
English words you will find a corresponding Latin word in
each of the two Latin lists. Find these two words and write
the number in front of the first and the letter in front of the
second on the blank line just before the English word. Thus
the first English word is “approach.” Since “accedo” in the
second column means “approach,” “1” is written upon the
blank line in front of “approach,” and since “advenio” in
the third column likewise means “approach,” there is likewise
a “b” on the same blank line after the “1.”


   I. approach       1. accedo        a. abeo
  II. attack         2. contendo      b. advenio
 III. capture        3. convenio      c. aggredior
  IV. carry          4. discedo       d. arbitror
   V. depart         5. emo           e. capio
  VI. find           6. expugno       f. clamo
 VII. give           7. fero          g. cupio
VIII. hasten         8. impero        h. habito
  IX. help           9. invenio       i. incendo
   X. kill          10. neco          j. interficio
  XI. know          11. oppugno       k. iubeo
 XII. live          12. puto          l. iuvo
XIII. order         13. scio          m. porto
 XIV. think         14. terreo        n. propero
  XV. wish          15. vivo          o. reperio


Directions: In the first list below are the names of a number
of foreign cities. In the second list are the names of rivers, 
and in the third of countries. Look at each city in the first 
list and then find the river in the second list upon which the 
city is located and the country in the third list in which 
the city is situated. Write the number in front of the river 
and the letter in front of the country on the blank line in 
front of the city. 


___ I. Bagdad
___ II. Benares
___ III. Buenos Aires
___ IV. Budapesth
___ V. Cologne
___ VI. Lisbon
___ VII. Lyons
___ VIII. Montreal
___ IX. Nanking
___ X. Seville

1. Danube
2. Ganges
3. Guadalquivir
4. Orange
5. Orinoco
6. Plata
7. Po
8. Rhine
9. Rhone
10. St. Lawrence
11. Tagus
12. Tigris
13. Volga
14. Yellow
15. Yang-tse-kiang

a. Argentina
b. Brazil
c. Canada
d. China
e. France
f. Hungary
g. India
h. Italy
i. Japan
j. Mesopotamia
k. Portugal
l. Prussia
m. Russia
n. Spain
o. Turkey





4. Summary. Matching tests are one of the two or 
three best types of the new examination and should be
among those receiving the most frequent use. There 
is almost no subject or portion of a subject to which 
they cannot be well adapted. The number of items 
should vary from ten to twenty and the arrangement 
within each list should be random. It is usually better 
to have more items in one list than in the other or 
else some in both which do not match in order that 
pupils who know the majority of them cannot by a 
process of elimination get the others correct. Match- 
ing tests may be classified into two main divisions: 
ordinary or simple matching tests in which there are 
only two lists of items, and compound matching tests 
which contain three or occasionally even more. This 
latter variety should not be used in the lower grades. 
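
The construction rules in this summary (random arrangement within each list, and extra items in one list so that elimination does not give away the last answers) can be sketched as follows. This is an illustration under stated assumptions, not a procedure from the text: the function name `build_matching_test`, its parameters, and the choice of sample items are all hypothetical, and the options are assumed to be distinct and at most twenty-six in number.

```python
import random

def build_matching_test(pairs, distractors, seed=None):
    """Return (prompts, labelled_options, key) for an ordinary matching test.

    pairs: dict mapping each prompt to its correct option.
    distractors: extra options that match no prompt.
    Options are assumed distinct and no more than 26 in number.
    """
    rng = random.Random(seed)
    prompts = list(pairs)
    options = list(pairs.values()) + list(distractors)
    rng.shuffle(prompts)   # random arrangement within each list
    rng.shuffle(options)
    # Letter the options only after shuffling so the letters carry no hint.
    labelled = dict(zip("abcdefghijklmnopqrstuvwxyz", options))
    label_of = {option: label for label, option in labelled.items()}
    key = {prompt: label_of[pairs[prompt]] for prompt in prompts}
    return prompts, labelled, key

pairs = {"Gold": "Au", "Copper": "Cu", "Iron": "Fe"}
prompts, labelled, key = build_matching_test(pairs, ["Ti", "A"], seed=1)
for prompt in prompts:
    print(f"___ {prompt}")
for label, option in labelled.items():
    print(f"{label}. {option}")
```

Because the second list is longer than the first, a pupil who knows two of the three elements still has three symbols left to choose from for the last one.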



CHAPTER XV 


INCORRECT-STATEMENTS TESTS

1. General discussion. The incorrect-statements type
of test is composed of statements some or all of which 
are incorrect, but can be made correct by the change, 
insertion, or removal of a word or two. In no case is 
there any indication given the pupils as to the point 
in the sentence at which the change is needed. Some- 
times all that pupils are asked to do is to indicate the 
words which make the statements incorrect, but it is 
usually better to have them supply the correct words 
as well. 

Although this type of test has not received nearly
as wide use as at least four or five of the other types, 
it possesses a considerable number of merits. It ap- 
pears to test knowledge of complete thoughts and other 
abilities than mere memory of facts better than do 
several of the other types. Practically anything which 
can be dealt with by complete statements in other types
of tests can also be tested by the use of this type. An- 
other advantage is that there is practically no chance 
element or opportunity of guessing present. 

On the other hand incorrect-statements tests in common
with completion tests are relatively difficult to
construct so that the objectivity of scoring is high. Pupils
will frequently find ways of changing the incorrect
statements other than those thought of by the
teacher, and in some cases it will be difficult to decide
whether some of these unforeseen changes should be
given credit as correct or not.

The same charge is sometimes made against incor- 
rect-statements tests as has already been mentioned in 
the case of alternative and other types, that pupils 
may be confused by the presentation of false state- 
ments. The reply is, of course, the same here as else- 
where, that, if they have studied the material fairly 
well, no serious confusion should result from so doing. 

In general the same suggestions which were given 
under alternative tests in regard to the wording of the 
exercises apply here also. Perhaps the one most im- 
portant point to keep in mind is that rarely, if ever, 
should a word to be crossed out, or one to be inserted, 
be a mere negative. It is very rarely if ever desirable 
to include a statement which should have more than 
one change made in it. However, statements such that 
they may be changed in any one of two or even more 
ways so as to be correct are often satisfactory.

The proportion of incorrect to correct statements 
may be in any ratio. Sometimes a few incorrect ones 
are inserted among a rather large number of correct 
ones, sometimes the numbers are approximately equal, 
sometimes the majority are wrong, and occasionally 
even all the statements are incorrect. There is also 
a diversity of practice in the matter of whether or not 
the same type of correction is called for by all the 
statements in a single test. It seems as a general rule 
better not to include in a single test some statements 
in which words are to be inserted, others from which 
they are merely to be crossed out, and still others 
which require the replacing of incorrect words by correct
ones. At least in the elementary grades such a
mingling of different varieties is confusing and likely
to bring about the result that pupils do not have
clearly in mind just what they are to do, and even in
high school it has little to recommend it.

The ordinary method of scoring such tests is to 
count one point for each statement correct. If, as is 
usually the case, some of the statements are correct 
as they stand, one point is usually allowed for each of 
these which is not changed. In other words, pupils are 
given credit for recognizing the correctness of state- 
ments as well as for knowing which are incorrect and 
being able to correct them. 
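
This scoring rule can be put into a short sketch. It is an illustration only and not part of the original text: the function name `score_incorrect_statements` and the data layout are assumptions. The sample key uses the olfactory-nerve sentence from the sample directions later in this chapter, for which either of two changes is accepted, together with one statement that is correct as it stands.

```python
def score_incorrect_statements(responses, key):
    """Allow one point for each statement handled correctly.

    key[n] is None when statement n is correct as it stands; otherwise it
    is a set of acceptable (crossed_out, written_in) pairs.
    responses[n] is None (or absent) when the pupil left statement n unchanged.
    """
    score = 0
    for n, accepted in key.items():
        answer = responses.get(n)
        if accepted is None:
            score += answer is None      # credit for leaving a true statement alone
        else:
            score += answer in accepted  # credit for any acceptable correction
    return score

key = {
    1: {("sight", "smell"), ("olfactory", "optic")},  # two acceptable changes
    2: None,                                          # correct as it stands
}
print(score_incorrect_statements({1: ("olfactory", "optic"), 2: None}, key))
print(score_incorrect_statements({1: None, 2: ("outer", "inner")}, key))
```

The first pupil earns both points; the second, who left the false statement alone and altered the true one, earns none.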

Although this type of test should probably not be 
one of the two or three kinds to receive more frequent 
use, the writer recommends that it be employed more
often than is in general true at present. Especially 
with more mature pupils does it seem to have decided 
value. 

2. Examples of incorrect-statements tests. Five ex- 
amples of this type of test are all that are given in 
this chapter. These do not illustrate all of the slight 
variations in possible forms but probably do illus- 
trate all those of much merit. Most of the possible 
variations are so slight as hardly to be worth illustra- 
tion with separate tests. The examples given in con- 
junction with the discussion in the preceding section 
are sufficiently suggestive of the kinds of incorrect- 
statements tests which teachers may prepare. 

The first example illustrates the variety in which
pupils are directed merely to cross out the incorrect 
words and not to replace them. Most of the statements 
in this test are correct, therefore only a few should 
be marked at all. 

Directions: Below are ten statements dealing with facts
you have studied in commercial geography. Most of the statements
are true but a few of them are not true. Each of these
incorrect statements could be made true by the omission or
change of a single word. Read each statement. If you think it
is true, go on to the next statement; if you think it is not true
draw a line through the one word which makes it false.

1. Medieval fairs greatly stimulated wholesale trade. 

2. Broken plateaus and mountain slopes are suitable for 
grazing. 

3. The principle of maximum returns determines the locali- 
zation of industries. 

4. The Kiel Canal has the most traffic of any canal. 

5. The chief product of Denmark is butter. 

6. New England engages largely in fishing.

7. The Erie Canal connects Albany and Erie. 

8. The leading peanut market is Norfolk. 

9. The Black Hills of Dakota yield considerable gold.

10. In California oats replaces corn as food for stock. 

The four other examples all provide for the crossing
out of the wrong words and the writing in of the
correct ones. The first of them, which is similar to the 
one just given in that it contains only a few errors, 
illustrates a possibility in arithmetic. In the second a 
majority of the statements should have some change 
made in each, and in the last two all of the statements 
are wrong. It will be noticed that in the last one pro- 
vision is made for recording the proper word to be sup- 
plied on a blank line in front of each statement so 
that scoring is somewhat facilitated. It would be pos- 
sible also to provide another blank line on which the 
incorrect word should be written. 

Directions: On the page below are twenty multiplication
examples with answers. Most of the answers are right, but a
few are wrong. Look through the examples and decide which
are right and which are wrong. Wherever you find one that is
wrong, draw a line through the answer and write the correct
answer immediately after it. To illustrate this, the first example
is “6 × 9 = 56.” This is wrong, since 6 × 9 = 54.
Therefore you should draw a line through the “56” and
write “54” just after it. Now go ahead and do the others,
marking each one right or else crossing out the wrong answer
and putting the right answer there.


 1. 6 × 9 = 56          11. 17 × 7 = 119
 2. 4 × 18 = 72         12. 6 × 23 = 118
 3. 12 × 11 = 132       13. 4 × 37 = 148
 4. 8 × 19 = 152        14. 8 × 23 = 184
 5. 9 × 13 = 117        15. 12 × 16 = 192
 6. 7 × 16 = 122        16. 7 × 24 = 168
 7. 11 × 14 = 154       17. 9 × 16 = 144
 8. 8 × 17 = 136        18. 14 × 13 = 182
 9. 13 × 12 = 156       19. 11 × 16 = 176
10. 14 × 9 = 126        20. 4 × 29 = 116
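
The scoring key for this exercise can be produced by recomputing each product. The sketch below is an illustration and not part of the original text; the names `EXAMPLES` and `wrong_answers` are hypothetical, and the triples simply transcribe the twenty printed examples as (factor, factor, printed answer).

```python
# The twenty printed examples as (factor, factor, printed answer).
EXAMPLES = [
    (6, 9, 56), (4, 18, 72), (12, 11, 132), (8, 19, 152), (9, 13, 117),
    (7, 16, 122), (11, 14, 154), (8, 17, 136), (13, 12, 156), (14, 9, 126),
    (17, 7, 119), (6, 23, 118), (4, 37, 148), (8, 23, 184), (12, 16, 192),
    (7, 24, 168), (9, 16, 144), (14, 13, 182), (11, 16, 176), (4, 29, 116),
]

def wrong_answers(examples):
    """Return {item number: correct product} for every wrong printed answer."""
    return {
        number: a * b
        for number, (a, b, printed) in enumerate(examples, start=1)
        if a * b != printed
    }

print(wrong_answers(EXAMPLES))  # {1: 54, 6: 112, 12: 138}
```

Three of the twenty answers are wrong, which agrees with the statement in the directions that most of the answers are right.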


Directions: Most of the statements on this page are wrong
though a few are right. Each of the wrong statements is
made untrue because of some one word in it. Read each statement
and, if you think it is wrong, cross out the word which
makes it wrong and write the correct word, that is, the word 
which will make it true, immediately above the word you 
have crossed out. For example, if you had such a sentence as 
“The nerve of sight is called the olfactory nerve” which is 
not true, you could make it true by crossing out “sight” 
and writing in “smell” immediately above it, or else by cross- 
ing out “olfactory” and writing “optic” immediately above 
it. If a sentence is correct as given do not cross out any word 
or make any mark in it. 

1. Food is entirely digested in the stomach. 

2. The hard shell on the outside of the visible part of a 
tooth is the dentine. 

3. A food rich in fat yields more heat and energy than one 
rich in sugar. 

4. The most important function of the red corpuscles in the 
blood is to destroy bacteria. 

5. When a dislocation occurs the nearby ligaments are
rarely torn. 





6. The basis of the voice is a sound made in the pharynx. 

7. The medulla oblongata is located in the upper part of 
the brain. 

8. The outer layer of the skin is called the epidermis. 

9. The normal temperature of the body is a fraction above 
99°. 

10. The purely liquid part of the blood is called the lymph. 

Directions: Each of the statements given below is wrong
because of some one word contained in it. Find the wrong 
word in each statement, draw a line through it and write in 
the correct one, that is, the word which will make the state- 
ment true, immediately above the wrong word. 

1. The three main parts of a leaf are the stem, petiole and 
leaf-base. 

2. Proteins are formed in living plant cells by combining 
the products of photosynthesis with hydrogen. 

3. One result of alcoholic fermentation is to produce carbon 
dioxide in place of nitrogen. 

4. The leaves of palms are called fronds. 

5. Fossils are the remains of past plants or animals pre- 
served in sand. 

6. The Mendelian law classifies characteristics as dominant 
and inherited. 

7. The common grains, such as wheat, corn, oats, and so 
forth, are members of the arum family.

8. There are three main groups of dicotyledons. 

9. An imperfect flower is one which does not have both
stamens and petals. 

10. The recombination of inorganic elements into organic 
compounds is the function of transpiration. 

Directions: Each of the ten statements given below is un- 
true because there is some one word in it which is wrong. Read 
each statement, find the wrong word, and draw a line through 
it. Then write the word which should take the place of the 
wrong word so as to make the statement true on the blank 
line in front of each statement. 

______ 1. Scissors are four inches or less in length while
          shears are longer.
______ 2. Needles ordinarily come in ten sizes.
______ 3. Gingham is made of linen.
______ 4. The stitch used in beginning a buttonhole is
          called the blanket stitch.
______ 5. The warp is the part of cloth which runs across
          the piece.
______ 6. Most men’s overalls are made of flannel.
______ 7. Printing usually results in a more lasting color
          than other methods of dyeing.
______ 8. Cotton clothing usually takes up moisture more
          readily than woolen clothing.
______ 9. A French seam is one so made that the raw edge
          of the cloth is exposed.
______ 10. The kind of hand stitching which looks like
          machine stitching on the right side is called
          feather stitching.

3. Summary. The incorrect-statements type of test
possesses a considerable degree of merit and deserves 
wider use than it is at present receiving. It is rela- 
tively difficult to construct so as to yield highly 
objective scores and has certain other minor disadvan- 
tages. In such tests the proportion of incorrect state- 
ments may vary from very small to all. Ordinarily 
one point is given for each statement correctly changed 
or recognized as being already correct. 



CHAPTER XVI


MISCELLANEOUS TYPES OF THE NEW 
EXAMINATION 

1. General discussion. In addition to the six types of 
the new examination dealt with in the last six chap- 
ters, there are a number of others which, because of 
the limited number of varieties of each, the fact that 
they are appropriate in comparatively few subjects or 
portions of subjects, and various other reasons, do not 
seem to merit treatment in separate chapters. Several 
of these types are more or less similar to certain of 
the more important ones already described, but still 
possess distinctive features which appear to justify 
classifying them under other heads. The tests treated 
in this chapter have been grouped under five headings 
as follows: 

Identification Tests 
Distinguishing Tests 
Continuity or Rearrangement Tests 
Verification or Judgment Tests 
Analogies Tests 

Each of these varieties will be discussed and illustrated 
in one of the following sections of this chapter. 

2. Identification tests. This name has been chosen to 
apply to tests which consist of a picture or figure, 
different features of which are in some way to be 
identified. Some, or perhaps all, of the varieties of 
identification tests might have been included under 

396 



396 TRADITIONAL EXAMINATIONS AND NEW-TYPE TESTS 

matching tests since they call for the matching of 
terms with parts or features of the figure or diagram. 
One or two of the varieties, however, could scarcely be 
properly so classified and all of them are distinguished 
by the use of pictures or diagrams ; therefore it seems 
best to treat them as a separate type. 

Identification tests have little place in some subjects 
such as literature, foreign language, civics, and so 
forth, but in all of the natural sciences, in geography, 
history, geometry, manual training, and so forth, there 
are many facts and relationships which may be very 
satisfactorily tested by their use. Indeed, they measure 
certain abilities and types of knowledge which it is 
frequently rather difficult to test without the use of 
figures or diagrams. Most varieties of them can be 
made entirely objective in scoring. The one serious 
limitation on the use of such tests is that they are
somewhat difficult to prepare. It is not as easy to construct
and produce duplicate copies of pictures and
figures as of ordinary written material. In many cases,
however, the necessary drawings can be placed upon
the board, although this is not possible in all varieties.
In others such simple drawings can be used that
the difficulty of their construction is not very great.

The general method of scoring such exercises is to 
allow one point for each identification correctly made. 
In other words, it is essentially the same as for match- 
ing tests. 

The first illustrative test given is in the field of 
geography. It presents two circles, one of which is as- 
sociated with the names of the six continents and the 
other with those of the five oceans. Each circle is divided
into the appropriate number of parts proportional
in size to the areas of the continents and oceans
respectively. These parts are numbered, and, by writing
the numbers in front of the names of the continents
and oceans, pupils are to indicate their knowledge
of the relative sizes thereof. In one respect this test 
is open to the same criticism made of one form of 
matching tests, that since all the items in each set are 
to be matched, it is possible to arrive at the last re- 
sponses by a process of elimination. However, this 
cannot well be avoided in some cases, and in such a 
test as that given does not appear to be a serious 
objection. 


Directions: On the page below you see two circles, each 
divided into parts. One circle, which is divided into six parts, 
has the names of the six continents beside it. The sizes of the 
six parts of the circle are proportional to the areas of the 
continents. You are to indicate the relative sizes of the con- 
tinents by writing the figure in each part of the circle in front 
of the name of the continent which is represented by that 
part of the circle. The second circle is divided into five parts 
to represent the five oceans which are named beside it. Do 
the same with this, writing the number in each part in front
of the ocean to which that part corresponds. To make sure 
that you understand what you are to do, it has been illustrated 
by one example. No. 1 in the circle representing the continents
is in the smallest part of the circle. Since the smallest
continent is Australasia, a “1” has been written in front of 
it. Go ahead and do the other continents and oceans in the 
same way. 



Continents
___ Africa
___ Asia
_1_ Australasia
___ Europe
___ North America
___ South America

Oceans
___ Antarctic
___ Arctic
___ Atlantic
___ Indian
___ Pacific




The next example presents a couple of exercises
which might be used in physics. It differs from the
previous one in that more portions of the figures are
indicated than there are terms with which these
are to be matched. In other words, it renders impossible
determining the last responses by elimination.


Directions: Below is a representation of a prism with lines
indicating the separation of light into the seven colors. The 
lines which represent these colors are lettered. Beside the 
figure you see the names of five of the seven colors. Write the 
letter of the line which shows the position of each color in 
front of the name of that color. 


blue green orange red violet 



relative heights of these columns represent the specific grav- 
ities of a number of substances. Five of these are named at 
the right of the columns. Decide which column represents the 
specific gravity of each of the five substances and write the 
letter found on that column in front of the substance. 






The next test illustrates a possible kind of map work 
for use in geography. It presents outline maps of the 
United States and of South America. On each are a 
number of figures which indicate spots which are im- 
portant centers for certain products. These products 
are named and the proper number from the map is to 
be written in front of each. 

Directions : Below is a map of the United States with eight 
large figures on it. Above the map is a list of eight products. 
Bach of the figures is located in a spot well known for the 
production of some one of the articles named. For example, 
the figure “1” indicates a region where a great deal of gold 
is mined. Therefore it has been written in front of the word 
“gold” in the list. You are to do the same with each of the 
other figures, that is, write it in front of the name of the 
article produced in that part of the country indicated by the 
figure. Below the map of the United States is another map 
representing South America. It also has eight figures upon it 
and a list of products beside it. Do just the same for that as 
for the one of the United States, that is, write each figure in 






front of the product which comes from the part of South 
America shown by the figure. 

cotton 1 gold iron ore packed meat 

fish grain steel sheep 






The next two exercises are from general science. 
They differ from those already given chiefly in one 
minor point, that is, that, instead of having portions 
or features of the figure numbered or lettered, the 
terms given are numbered or lettered, and these num- 
bers or letters are to be written upon the figures at the 
proper places. In the case of the first exercise it appears 
that this should not prevent scoring from being 
perfectly objective, but in the case of the second and 
others similar to it, it would probably be difficult at 
times to determine whether the letter or number had 
been placed on just the right portion of the figure or 
not. However, this is not a serious matter and should 
not hinder the use of this form occasionally. Ordi- 
narily, however, it is probably best to avoid it by let- 
tering the figure rather than the terms. 


Directions : Below you see a cross to the right of which are 
eight small circles. The cross represents the sun and each of 
the circles one of the planets. The distances of the circles 
from the cross are proportional to the distances of the plan- 
ets from the sun. Below the row of circles you see the names 
of the planets each with a number in front of it. Write the 
number in front of the name of each planet on or close to the 
circle which represents that planet. 

[Figure: a cross representing the sun, with a row of eight circles to its right representing the planets] 


(1) Earth, (2) Jupiter, (3) Mars, (4) Mercury, (5) Nep- 
tune, (6) Saturn, (7) Uranus, (8) Venus. 


Directions: Below is a drawing of the human eye. At the 
right of this drawing are the names of ten parts of the eye. 
Write the letter in front of each name on the part of the eye 





to which that name refers. Be as careful and exact as pos- 
sible in placing each letter exactly where it belongs. 



a. aqueous humor     f. lens 
b. blind spot        g. optic nerve 
c. choroid coat      h. retina 
d. cornea            i. sclerotic coat 
e. iris              j. vitreous humor 


The last examples of identification exercises are 
from the field of geometry. The first presents a triangle 
and the second a circle with a diameter, radius, chord, 
and tangent. Beside each figure the names of a number 
of terms are given. Most, but not all, of these terms are 
illustrated by portions of the figures. Pupils are to 
identify all those which are so illustrated by lettering 
or numbering the figures. 


Directions: You will see below a triangle and beside it a 
list of nine terms. Most, but not all, of these terms are illustrated 
by some part of the triangle. Write the letter in front 
of each term which is so illustrated on the part of the tri- 
angle which represents it. Be very careful to locate your 
letters exactly. When you have finished doing this for the 
triangle, do the same for the circle and accompanying lines. 



a. acute angle 
b. hypotenuse 
c. leg 
d. right angle 
e. vertex 
f. complementary angle 
g. exterior angle 
h. obtuse angle 
i. supplementary angle 


a. chord 

b. diameter 

c. inscribed circle 

d. radius 

e. secant 

f. sector 

g. tangent 





3. Distinguishing tests. A type of test which perhaps 
scarcely belongs in this chapter but which is included 
here because it does permit a fair degree of objec- 
tivity in scoring is that which calls for the making 
of distinctions between pairs of terms or, in rare cases, 
among three or more terms. These terms may either 
be words which are more or less but not exactly synonymous 
or words between which there is any degree 
of difference. This variety of test has value in any 
subject in which the acquisition of exact definitions and 
fine distinctions of meanings is at all important, and 
even in some cases in which it is not. The score usually 
consists of one point for each distinction correctly 
given, though sometimes two points are given for a 
correct distinction and one for a partially correct one 
which seems too good to be rated zero. 

This variety of test is illustrated by three examples. 
The first presents synonymous words from Latin, the 
second synonymous terms from the field of economics, 
whereas the third presents biological terms which are 
not at all synonymous. 

Directions: Below you see ten pairs of Latin words. The 
two words after each number mean practically the same thing, 
but there is in every case a difference between them. On the 
line following each pair of words state as clearly and briefly 
as possible what this difference is. 

1. vir        homo       ________ 
2. magister   dominus    ________ 
3. clipeus    scutum     ________ 
4. dux        imperator  ________ 
5. scio       intellego  ________ 
6. et         que        ________ 
7. a          ab         ________ 
8. audax      fortis     ________ 
9. dico       narro      ________ 
10. uterque   ambo       ________ 




Directions : Each pair of terms in the list below means about 
but not exactly the same thing. On the line following them 
explain as exactly and in as few words as possible what the 
difference in each case is. 

1. economy and saving 

2. wealth and prosperity 

3. luxury and comfort 

4. scarcity and rarity 

5. value and desirability 

6. industry and thrift 

7. rivalry and competition 

8. liberalism and single tax 

9. property and capital 

10. partnership and corporation 

Directions : Below are ten pairs of terms from biology. The 
terms in each pair are more or less alike, but yet decidedly 
different. State as briefly and well as possible what the dif- 
ference between each pair of terms is. 

1. animal and plant 


2. algae and fungi 


3. respiration and transpiration 


4. families and orders 


5. primary and secondary roots 


6. myriopoda and Crustacea 










7. grafting and budding 


8. sepal and petal 


9. gymnospermae and angiospermae 


10. clamatores and oscines 


4. Continuity or rearrangement tests. This name is 
applied to tests which present a series of items in 
confused or random order and require pupils to 
rearrange them in the proper order according to some 
designated basis. For example, historical characters or 
events may be given to be arranged in chronological 
order or perhaps in order of importance; in physics 
or chemistry the names of substances may be given 
to be arranged in order of specific gravity; and so on. 
It will at once be seen that there is little place for this 
type of test in many kinds of subject-matter, but on 
the other hand there are numerous instances in which 
it may properly be used. In many situations it is much 
more important for pupils to know the relative order 
of items than to know the exact date, amount, or some 
other characteristic of each, and this type of test meas- 
ures just such knowledge. It is easy to construct and 
administer and in general readily understood by pupils. 

The number of items in a single exercise of this 
sort should not be great, certainly rarely, if ever, more 
than ten. Indeed, in many instances about six would 
seem to be a more desirable number. 








Although continuity tests can easily be constructed 
so that scoring is perfectly objective, this does not 
mean that scoring is necessarily easy. If the score 
is determined simply by counting one point for each 
item correctly placed, there is no difficulty in com- 
puting it, but this method is hardly satisfactory. For 
example, suppose that in a history test pupils have 
been given the following names of characters to 
arrange in chronological order: “Jackson, Franklin, 
Jefferson, Cleveland, Lincoln.” The correct arrangement 
is, of course, “Franklin, Jefferson, Jackson, Lincoln, 
Cleveland.” Suppose, however, that some pupil rear- 
ranges them as follows: “Cleveland, Franklin, Jeffer- 
son, Jackson, Lincoln.” Taking the group as a whole, 
he has no one in the proper place, yet the last four 
are all arranged correctly with respect to one another. 
Evidently such a pupil should receive some credit for 
his response. To take a more likely example, suppose 
another pupil arranges them in the order: “Jefferson, 
Franklin, Jackson, Cleveland, Lincoln.” It is evident 
that he also has some knowledge of their chronolog- 
ical arrangement and deserves more credit than a pu- 
pil who did not answer at all or who arranged them 
in such an order as: “Lincoln, Cleveland, Jackson, Jef- 
ferson, Franklin.” 

The suggested method of scoring which allows for 
differences of the sort just illustrated is based upon 
the sum of the differences between the true order of 
items and those given by the pupils. Several slightly 
different procedures for computing the sum and allot- 
ting scores according to it have been suggested, but 
there is little essential difference between them. Per- 
haps as simple and satisfactory a method as any is 
as follows: Let the total number of points of credit 





on any particular exercise be equal to the greatest 
possible sum of differences, which is the sum of the 
differences between the true order and a completely 
inverted order, that is, an order in which the item 
which should come first is last, that which should 
come second is next to last, and so on. For different 
numbers of items from two up to ten inclusive, the 
greatest possible sums of the differences are shown 
in the accompanying table. From these figures it is 
seen that the greatest possible sum of differences 
is two if there are only two items, four if there are 
three, and so on. In employing this method of scoring, 
the sum of the differences between the order actually 
given by a pupil and the correct order is found. This 
sum is subtracted from the greatest possible sum and 
the remainder is the pupil’s score on that exercise. 
A pupil whose order is entirely correct has no 
differences, therefore nothing to subtract, and so 
receives the greatest possible score, whereas a pupil 
who has exactly inverted the order receives a score of 
zero because the amount to be subtracted is equal to 
the total number of points allowed. 

This method may be illustrated by applying it to 
the cases cited above in the text. The pupil who placed 
Cleveland first and the other four after him in correct 
order had the following differences: Cleveland first 
instead of fifth, Franklin second instead of first, 
Jefferson third instead of second, Jackson fourth instead 
of third, Lincoln fifth instead of fourth. The sum of 

1 If the number of items is even, the sum of the differences is equal to one-half 
the square of the number; if it is odd, the sum equals one-half of one less 
than the square. 


No. of items    Sum of diff. 
     2               2 
     3               4 
     4               8 
     5              12 
     6              18 
     7              24 
     8              32 
     9              40 
    10              50 




these differences is 8 (4 + 1 + 1 + 1 + 1). Subtracting 8 
from 12, the greatest possible sum of the differences 
for five items, yields a score of 4 for this pupil. To take 
the second case mentioned above, that of the pupil 
who arranged them in the order: “Jefferson, Frank- 
lin, Jackson, Cleveland, Lincoln,” we find differences 
as follows: Jefferson first instead of second, Franklin 
second instead of first, Jackson correct, Cleveland 
fourth instead of fifth, and Lincoln fifth instead of 
fourth. In other words, in the case of each man except 
Jackson there is a difference of one or a total differ- 
ence of four, so that this pupil’s score is 12 — 4, or 8. 

At first inspection this method of scoring may ap- 
pear somewhat elaborate and burdensome, but, after 
one becomes acquainted with it, it is relatively simple 
and easy in its application. It is, therefore, recom- 
mended for use with such tests. 
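
The scoring rule just described can be sketched in a few lines of Python. This is offered only as an illustration and not as part of the original procedure: the function names `max_difference_sum` and `continuity_score` are hypothetical, and the sketch assumes that each item is distinct and that a pupil's response contains every item exactly once. How an omitted or repeated item should be scored is not stated in the text, so the sketch simply rejects such a response.

```python
def max_difference_sum(n):
    """Greatest possible sum of differences for n items.

    Follows the footnote rule: half the square of n when n is even,
    half of one less than the square when n is odd.
    """
    return (n * n) // 2


def continuity_score(true_order, pupil_order):
    """Score one rearrangement exercise by the sum-of-differences method."""
    # Assumption: the response is a rearrangement of the true order.
    if sorted(true_order) != sorted(pupil_order):
        raise ValueError("response must contain each item exactly once")
    # Position of each item in the correct order.
    true_rank = {item: rank for rank, item in enumerate(true_order)}
    # Sum of the distances between given and correct positions.
    total_difference = sum(
        abs(rank - true_rank[item]) for rank, item in enumerate(pupil_order)
    )
    return max_difference_sum(len(true_order)) - total_difference


TRUE_ORDER = ["Franklin", "Jefferson", "Jackson", "Lincoln", "Cleveland"]

first_pupil = ["Cleveland", "Franklin", "Jefferson", "Jackson", "Lincoln"]
second_pupil = ["Jefferson", "Franklin", "Jackson", "Cleveland", "Lincoln"]

print(continuity_score(TRUE_ORDER, first_pupil))   # 4
print(continuity_score(TRUE_ORDER, second_pupil))  # 8
```

The two printed scores agree with the two cases worked out above, and an entirely correct response to the same exercise would receive the full 12 points.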

Three examples of this type of test are included. The 
first of these, in cooking, calls for the rearrangement 
of various articles of food on the basis specified by the 
statement accompanying each group of items. The sec- 
ond is in the field of history, in which this type of test 
should perhaps receive its most frequent use. It pre- 
sents names of characters and also of events which 
are, of course, to be arranged in chronological order. 
The third example is more unusual. It might be called 
a dictionary test, though perhaps this is not the best 
name for it. It deals with the ability of pupils to ar- 
range words in dictionary or alphabetical order. 

Directions : Below are a number of groups each containing 
the names of six articles of food. In front of each group is a 
statement which directs that the terms composing that group 
be numbered in a certain order. For example, in Exercise 1 
they are to be numbered in order of protein content 
beginning with the one which has the greatest. Therefore in this 
exercise you should place a figure “1” in front of the article of 
food which has the greatest protein content, a figure “2” in 
front of the one which ranks second, a figure “3” in front of 
the one which is third, and so on down until you place a figure 
“6” in front of the one which has the least protein content. 
When you have completed Number 1 go ahead and do each of 
the others, numbering them according to the basis indicated in 
the statement. 


1. Number the following in order of pro- 
tein content, beginning with the one 
which has the greatest. 


white bread 
eggs 

round steak 
spinach 
butter 
potatoes 


2. Number in order of per cent of carbohydrate, 
beginning with the largest. 


sugar 
potatoes 
oatmeal 
lima beans 
rice 

canned salmon 


3. Number in order of length of time 
taken to digest, beginning with the 
one which takes the least time. 


hard boiled eggs 
boiled hominy 
whole wheat bread 
cheese 
green beans 
oranges 


4. Number in order of proportion of 
water contained, beginning with that 
having the greatest. 


tomatoes 
cabbage 
lettuce 
watermelon 
cucumbers 
radishes 


5. Number in order of time needed to 
sterilize when canning, beginning 
with the one needing the longest time. 


raspberries 
tomatoes 
sweet corn 
wax beans 
beets 

cauliflower 





6. Number the stages in cooking sugar 
in order of temperature, beginning 
with the lowest. 


small thread 

caramel 

crack 

blow 

soft ball 

pearl 


7. Number in order of temperature at 
which burning begins, starting with 
the lowest. 


butter 

lard 

olive oil 

crisco 

cottolene 

suet 


8. Number in order of per cent of fat 
content, beginning with the largest. 


milk 

butter 

cheese 

oatmeal 

wheat bread 

ham 


9. Number in order of per cent of mineral 
matter, beginning with the 
greatest. 


dried beans 

lean beef 

cheese 

carrots 

milk 

spinach 


10. Number in order of time needed to 
cook, beginning with the longest. 


rolled oats 
whole oats 
corn meal 
steamed rice 
boiled rice 
hominy 


Directions : On the page below are ten groups each of which 
contains the names of five historical characters or of five 
events. These names are not arranged in correct chronological 
order, but you are to indicate this order by placing a “1” in 
front of the name of the character who lived first or of the 
event which occurred first of those in each group, a “2” in 
front of the second one in that group, and so on. For example, 





of the characters named in Group I, Charlemagne lived first 
so a “1” has been written in front of his name. William the 
Conqueror was the second of this group so a “2” has been 
written in front of his name. Go ahead and complete this 
group and then do each of the others in the same way. 


I 

2 William the Conqueror 
1 Charlemagne 
Richard the Lion-hearted 
Emperor Charles V 
Frederick Barbarossa 

II 

Henry VIII 
Charles I 
James I 
Richard III 
Edward VI 

III 

Alfred the Great 
Joan of Arc 
Robert Bruce 
Edward I 
Otto I 

IV 

Oliver Cromwell 
Martin Luther 
Gustavus Adolphus 
Henry IV of France 
Queen Elizabeth 

V 

Richelieu 
Mazarin 
William the Silent 
Sir Francis Drake 
Queen Isabella 

VI 

Erasmus 
Alcuin 
Michael Angelo 
Roger Bacon 
Peter Abelard 

VII 

The First Crusade 
The Battle of Hastings 
The Fall of Constantinople 
The Granting of Magna Charta 
The First War of the Roses 

VIII 

The Long Parliament 
The Spanish Armada 
The Union of England and Scotland 
The Discovery of America 
The End of the Thirty Years’ War 

IX 

The Accession of Louis XIV 
The Conquest of Mexico 
The Adventures of Marco Polo 
The Last Crusade 
The Expulsion of the Stuarts 

X 

The Battle of Bannockburn 
Final Expulsion of English from France 
Sobieski’s Defeat of the Mohammedans 
Battle of Crecy 
Founding of the Jesuits 


Directions: After each Roman numeral below you will see 
five words all of which begin with the same two letters. You 





are to look through each group of five words and determine 
the order in which these five words appear in a dictionary, 
that is, which one of the words comes first, which second, which 
third, and so on. Then place a figure “1” in front of the 
word in each group which comes first, a “2” in front of the 
one which comes second, and so on. To show you how to do 
this it has been done for the first group of words. If you 
looked up these words in the dictionary, you would find that 
“bough” came before any of the others so it has a “1” in 
front of it; “bought” would be next, so it has a “2” in front 
of it; “boulder” would come next, so a “3” is in front of it; 
“bound” next, so it is numbered “4”; and “bounty” would 
be last, so it is numbered “5.” Now go ahead and do the same 
for the five words in each group. 


I 

2 bought 
1 bough 
4 bound 
3 boulder 
5 bounty 

II 

car 
captor 
carat 
caramel 
caravan 

III 

coal 
coach 
coarse 
coast 
coal-oil 

IV 

defeat 
defect 
deface 
defense 
default 

V 

ember 
embarrass 
embalm 
embark 
embellish 

VI 

for 
foot 
fool 
forage 
food 

VII 

health 
heat 
heal 
head 
hear 

VIII 

hornet 
horror 
hope 
hoot 
horse 

IX 

instance 
instinct 
instil 
install 
instead 

X 

knight 
knot 
knit 
know 
knife 

XI 

liquor 
liquid 
liquidate 
liqueur 
liquefy 

XII 

mirage 
mirth 
miracle 
mirror 
mire 


5. Verification or judgment tests. This name is given 
to tests composed of statements or, in some cases, of 
questions which ask pupils to give reasons for them, 





ordinarily just the one chief reason for each. It is a 
valuable type of test from the standpoint of testing 
thought, but unfortunately its scoring cannot be made 
highly objective. Ordinarily there will be little doubt 
as to whether a suggested reason is true or not, but 
sometimes there will be, and still oftener it will be 
difficult to decide whether or not it is the chief reason. 
This can, of course, be largely avoided by the selection 
of facts for which there is one relatively undisputed 
chief reason. 

Two methods of scoring are used. Sometimes each 
reason is counted as merely right or wrong and some- 
times a scale of values running perhaps from zero up 
to five or even up to ten is used. The writer recom- 
mends, however, that if such a scale is employed it be 
very short, consisting merely of two points for en- 
tirely correct reasons and one point for those partially 
correct. 

Two of the three following examples of this type of 
test contain definite statements, one dealing with 
American history and the other with geography. The 
third is composed of questions in cooking. 

Directions: Below are statements of a number of events 
which have occurred in American history. On the blank line 
under each statement tell in as few words as possible the chief 
reason why the event named occurred. 

1. The coming of the Pilgrims to America. 


2. The final victory of the British in the French and Indian 
Wars. 


3. The selection of Washington as commander-in-chief of the 
Continental Army. 




4. The war with Mexico. 


5. The split in the Democratic party in 1861. 

6. The panic of 1894. 

7. The defeat of Bryan in 1896. 


8. The construction of the Panama Canal. 

9. The election of Wilson in 1912. 

10. The entrance of the United States into the World War. 


Directions: Each of the ten statements below states a fact 
which you should have learned in this course in geography. 
On the line below the statement of each fact give the chief 
reason why this fact is true. State each reason as clearly and 
briefly as possible. 


1. A degree of latitude at the poles is longer than one at 
the equator. 


2. In northern Greenland the compass needle does not point 
north. 











3. The mercury in a barometer falls as one ascends a moun- 
tain. 


4. Bodies of water tend to equalize the temperature of the 
regions bordering them. 


5. Dew is formed more readily on a clear than on a cloudy 
night. 


6. The ocean is gradually becoming more salty. 


7. Sea water increases in density towards the bottom. 


8. Civilization appears to develop best in the temperate 
zone. 


9. The destruction of forests tends to cause floods. 


10. The windward side of mountains usually receives more 
rain than the other. 


Directions : The test below consists of a number of questions 
each of which should be answered by stating one chief reason. 
Write this reason, using as few words as possible, on the blank 
lines following each question. 

1. Why is fat necessary in one’s diet? 


2. Why should kitchen walls be finished in oil paint rather 
than water colors? 











3. Why are vitamins necessary in diet? 

4. Why should dishes be rinsed with boiling water? 

5. Why is milk pasteurized? 

6. Why does bread rise? 

7. Why is graham flour more nutritious than white flour? 

8. Why are such foods as ketchup, pickles, and so forth used? 

9. Why is alum baking powder undesirable? 

10. Why should children not eat large amounts of cake? 


6. Analogies tests. Analogies exercises have received 
more or less use in intelligence tests, but not very 
much in those in the school subjects. Apparently this 
is due partly to the difficulty of constructing them and 
partly to the belief that even when they deal with 
material definitely studied they tend to measure general 
intelligence to a large degree. Each exercise in 
an analogies test consists of an analogy, ordinarily 
presented in the same form as an arithmetical pro- 
portion, with one or rarely two of the terms omitted. 
The pupils are, of course, to supply these terms. 

It is evident that tests of this sort measure knowledge 
of relationships rather than of simple disconnected 
facts. Because of this function it seems desirable to 
employ them occasionally though it is not 
recommended that their use be very frequent. They 
should probably not be employed at all in the lower 
elementary grades, very rarely in the upper ones, and 
only occasionally in the high school. 

Not only is it more or less difficult to prepare such 
exercises but also it is not always easy to secure ob- 
jectivity of scoring. The situation here is the same as 
that with respect to the completion and other types of 
tests, that pupils will almost always give some re- 
sponses which are so near the borderline that it is 
hard to decide whether they are correct or not. As is 
shown by one or two of the examples, this can be 
avoided by using the multiple-answer feature, that is, 
by suggesting several possible answers from which one 
is to be selected. 

The score is practically always taken as the number 
of the correct words or expressions supplied. 

The first two of the four following examples are 
simple analogies tests with no suggested answers and 
differ from each other only in the form of presentation. 
The form employed in the first of the two, which is the 
more complete, is ordinarily to be preferred to the 
elliptical form in the second. The third example dif- 
fers merely in containing a list of suggested answers 
for each exercise. The fourth presents a compound 
analogies test in which two of the four terms are to 
be supplied by pupils. 

Directions: Below are a number of exercises dealing with 
different forms of various words. Each exercise begins with a 
blank. On this you are to write the form which has the same 
relation to the first underlined word in each sentence as 





the second underlined word does to the third one. For 
example, the first exercise is: “Lay is to laid as lie is to 
lay.” “Lay” has been written upon the blank line because 
it is the present tense of “laid” just as “lie” is the present 
tense of “lay.” Look at each sentence beginning with No. 2 
and decide what word ought to be on the blank line, then write 
it there. 

1. Lay is to laid as lie is to lay. 

2. ________ is to run as saw is to seen. 

3. ________ is to sheep as men is to man. 

4. ________ is to good as quickly is to quick. 

5. ________ is to every as [illegible] is to each. 

6. ________ is to thief as dogs is to dog. 

7. ________ is to know as brought is to bring. 

8. ________ is to blow as [illegible] is to eat. 

9. ________ is to I as him is to he. 

10. ________ is to [illegible] as [illegible] is to [illegible]. 

11. ________ is to it as our is to [illegible]. 

Directions: On the page below are a number of exercises 
dealing with American history. Each consists of three 
expressions followed by a blank line. These expressions are in the 
same form and are to be read in the same way as a proportion 
in arithmetic, that is, the first exercise, for example, means 
“New York is to the Dutch as Delaware is to the Swedes.” 
You are to write in a word on each blank which will make 
the exercise true. The correct word for the blank in the first 
exercise is “Swedes” since they settled Delaware just as the 
Dutch settled New York. Look at each of the other exercises 
and write in the correct word on each blank line. 

1. New York — Dutch, Delaware — Swedes 

2. Madison — 1809, John Quincy Adams — ________ 

3. Northern Army — Grant, Southern Army — ________ 

4. Hamilton — Federalist, Jefferson — ________ 

5. Cartier — French, Cabot — ________ 

6. Pennsylvania — Quakers, Maryland — ________ 

7. English — Wolfe, French — ________ 





8. Howe — sewing machine, McCormick — ________ 

9. Louisiana — purchase, Texas — ________ 

10. Revolutionary War — Treaty of Paris, War of 1812 — ________ 

11. Cotton gin — Whitney, telegraph — ________ 


Directions : Below are ten exercises to which you are to re- 
spond, besides one which shows you what you are to do. Look at 
this first one. It reads: “Donnent is to donner as (dormirent, 
dorment, dormont, dormient) is to dormir.” Of the four forms 
in the parenthesis “dorment” has been underlined because it 
bears the same relation to “dormir” as “donnent” does to 
“donner.” Look through the other exercises and decide which 
of the four words in each parenthesis bears the same relation- 
ship to the word at the end of the sentence as the first word in 
the sentence does to the second French word, and draw a line 
under this word in each parenthesis. 

1. Donnent is to donner as (dormirent, dorment, dormont, 
dormient) is to dormir. 

2. Perdrez is to perdre as (boirons, boiriez, buvez, boirez) is 
to boire. 

3. [illegible] is to vouloir as (vis, vois, verrai, vit) is to voir. 

4. Sache is to savoir as (suives, suivre, suis, sus) is to suivre. 

5. Prenant is to prendre as (ferant, faisant, fisant, fairant) 
is to faire. 

6. Bonne is to bon as (beaux, belles, bel, belle) is to beau. 

7. Meilleur is to bon as (moindre, le pire, pire, le moindre) 
is to mauvais. 

8. Moi is to je as (lui, eux, soi, elle) is to [illegible]. 

9. Ton is to [illegible] as (vos, leur, nos, notre) is to nous. 

10. Dix-sept is to sept as (treize, quinze, dix-six, seize) is to 
six. 

11. Elles is to elle as (moi, nous, vous, ils) is to [illegible]. 

Directions : Below are a number of exercises each of which 
consists of a proportion such as you had in arithmetic but 
actually in algebra. The first two terms of each proportion 




have been omitted and you are to fill them in on the blank 
lines so as to make correct proportions. Any two terms or ex- 
pressions which make true proportions may be written on the 
lines. In the first exercise, for example, “a³” has been written 
on the first line and “a” upon the second, but any other 
expressions would do if the first were the cube of the second, 
since in that exercise the third term is the cube of the fourth. 
Now go ahead and write in expressions on the two blank lines 
in each exercise which make true 
proportions. 


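1. a³ is to a as x[illegible] is to x[illegible]. 

2. ________ is to ________ as 4a[illegible] is to 2a[illegible]. 

3. ________ is to ________ as 4a[illegible]b[illegible] is to [illegible]. 

4. ________ is to ________ as 2xy is to (x + y)[illegible]. 

5. ________ is to ________ as 3y[illegible] is to [illegible]. 

6. ________ is to ________ as x[illegible]z[illegible] is to x[illegible]yz[illegible]. 

7. ________ is to ________ as a[illegible] is to a[illegible]. 

8. ________ is to ________ as -3 is to [illegible]. 

9. ________ is to ________ as 16 is to 2. 

10. ________ is to ________ as [illegible] is to [illegible]. 

11. ________ is to ________ as (a + x)[illegible] is to (a + x)[illegible]. 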


7. Summary. Besides the six types of the new 
examination described in Chapters X to XV inclusive, 
there are certain other varieties which are used much 
less frequently chiefly because they are not appropri- 
ate in so many school subjects or portions of subjects. 
These are : identification, distinguishing, continuity or 
rearrangement, verification or judgment, and anal- 
ogies tests. Identification tests involve making the cor- 
rect connections between portions of figures or draw- 
ings and the names of these portions or other terms 
which apply. Distinguishing tests call for stating the 
distinctions between pairs of terms, the two in each 
pair being sometimes almost synonymous and sometimes 
not even partially so. Continuity or rearrangement 
tests present series of items in confused order 





and require pupils to indicate the order in which each 
series should be arranged. A verification or judgment 
test is composed of a number of statements or ques- 
tions for each of which the one chief or most impor- 
tant reason is to be given. Analogies tests are in the 
form of ordinary arithmetic proportions with one or 
sometimes two of the four terms omitted. 



CHAPTER XVII 


OBJECTIVE TESTS IN INSTITUTIONS OF 
HIGHER LEARNING 

1. General discussion. Although many, indeed most, 
of the principles laid down and the suggestions made 
in the preceding chapters apply to testing achieve- 
ment in institutions of higher learning as well as in 
elementary and secondary schools, yet it seems appro- 
priate to devote some specific attention to the use of 
objective tests in normal schools, colleges, universi- 
ties, and other similar institutions. The purpose of 
this chapter is not chiefly to discuss the methods of 
constructing and administering tests in such institutions, 
but rather to indicate their possibilities for this 
purpose and to point out the considerable use which 
they are already receiving therein. The use of tradi- 
tional examinations in institutions of higher learning 
will not be dealt with. Also no attention will be given 
to objective tests employed in connection with college 
entrance or other similar examinations, inasmuch as 
these are tests of achievement in high school and not 
of that at a higher level. The remark may be made 
in passing, however, that in such important examina- 
tions as those of the New York Regents and of the 
College Entrance Examination Board, tests of the 
new type are being employed. Wood (97 and 94, pp. 
274-292), the College Entrance Examination Board 
Commission on New Types of Examinations (50), and
others, have reported in more or less detail concern- 
ing the use of the new examination by these organiza- 
tions. Moreover, no space will be devoted to the dis- 
cussion of aptitude and prognostic tests, although 
many of this sort are being used to try to determine 
the ability of students to carry various college courses 
or to enter various vocations. 

It has sometimes been urged by college instructors 
and others interested that because much more memory 
work and many more detailed facts are appropriate 
in elementary and high school than in college and uni- 
versity, whereas the so-called higher thought proc- 
esses and activities should be called into play much 
more in such institutions, new-type tests are relatively 
unsuited for use in higher institutions. It is possible 
that there is some truth in this argument, but on the 
whole it does not seem to be very valid. A careful 
comparison of the subject-matter taught in high-school 
courses with that of undergraduate college courses 
indicates that probably as great a proportion of the 
total amount is factual material in the one case as in 
the other. The chief difference appears to be that more 
facts and more difficult facts, as well as more reflec- 
tive thinking and more difficult reflective thinking, are 
required in the college than in the high school. This 
justifies the position taken earlier that a testing pro-
gram is not satisfactory and complete unless it in-
cludes both traditional and new-type examinations and
also a number of varieties of each. Moreover, as has 
been stated earlier, it is not at all true that only tra- 
ditional examinations test reflective thinking whereas 
those of the new-type test merely factual information 
or knowledge. On the other hand, although there are 
certain desirable outcomes of instruction which ap-
parently only the traditional examination can measure
satisfactorily, many types of objective tests can be 
made to measure varieties of reflective thinking. In 
considering the use of objective tests in institutions 
of higher learning it seems well to consider the latter, 
or, perhaps better, the courses offered in the latter, 
under two heads. These are what may be called liberal 
arts or general courses and professional or vocational 
ones. A section will, therefore, be devoted to the use of 
new-type tests in each of these groups. 

In one respect a slightly different policy will be fol- 
lowed in this than in the previous chapters. That is, 
a few standardized and partially standardized tests 
will be mentioned. In many cases it is difficult to draw 
the line between standardized and non-standardized 
tests, and this is especially true in the case of several 
prepared for use in college. 

2. Objective tests in liberal arts courses. Almost all
high-school subjects are very similar in content to sub-
jects taught in liberal arts colleges, the chief differ-
ence being that the former are more elementary.
Therefore tests in the different high-school subjects 
such as have been illustrated in the last few chapters 
are likewise suitable for use in the same and similar 
subjects in college. Indeed, in many cases exactly the 
same tests are appropriate in both places. There is 
no reason why, for example, the same test that is used 
in first-year French or Spanish in high school should 
not be used in first-year college Spanish or French, 
though probably it should either be given earlier in the 
year or else higher scores expected upon it in college. 
Likewise either absolutely the same or very similar 
tests are appropriate in such subjects as algebra, the 
various natural sciences, history and the other social
sciences, the ancient languages, and so on. Therefore 
the reader who is interested in testing college classes 
in any of these subjects is referred to the tests for 
high-school use already given as models. 

In addition to this, however, it seems well to men- 
tion a number of instances of the actual use of tests 
in college courses both similar to those taught in high 
school and, therefore, already illustrated, and unlike
them. No attempt will be made to refer to all such 
tests which have been reported in educational litera- 
ture, but merely a few of those employed will be 
described briefly as illustrative of what is being 
done. 

By far the longest and most worth-while discussion 
of the use of tests in various college subjects is that 
of Wood (94, pp. 177-292), who deals with the use of 
objective tests in contemporary civilization, physics, 
government, zoology, economics, philosophy, Greek art, 
history, and English in Columbia University. He gives 
examples of the exercises used in each subject, state- 
ments as to the results both as regards economy in 
giving and scoring, and reliability and validity of 
scores, and opinions of a number of instructors con-
cerning new-type tests. It appears that the use of such
tests has passed the experimental stage at Columbia 
University and may be considered as regular a part 
of the procedure as is the employment of essay exam- 
inations. It will be seen from Wood’s descriptions and 
illustrations that true-false statements, completion 
statements, multiple-answer exercises, and single- 
answer exercises are the four types commonly used. 

In several of the natural sciences, among other sub- 
jects, tests have been prepared and used. Hunter and 
Moss (38), for example, have constructed such a test
in bacteriology. This consists of four subtests as fol- 
lows: 

1. Organisms, Methods, Infection, and Immunity.
   Section A (Multiple-answer).
   Section B (True-false).
2. Recognition and Diagnosis of Bacterial Micro-Organisms (Matching).
3. Laboratory Procedure (Rearrangement).
4. Identification of Micro-Organisms from Lantern Slides (Identification).

This is a semi-standardized, perhaps even fully 
standardized test, and is available for general use.¹

In chemistry a test by Webb (89), apparently used
at George Peabody College for Teachers, may be cited. 
This is a classification test of the multiple-answer 
type. It consists of a list of fifty substances which are 
to be classified as elements, compounds, or mixtures. 
It is readily evident that any instructor can easily pre- 
pare similar tests for his own use. 
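
A minimal sketch of how a classification test of this kind might be keyed and scored is given here. It is an illustration only: the five substances, the name score_classification, and the rule of one point for each correct classification are assumptions made for the example and are not taken from Webb's test, which lists fifty substances.

```python
# Hypothetical answer key: each substance is classed as an element,
# a compound, or a mixture. The items are illustrative only.
ANSWER_KEY = {
    "oxygen": "element",
    "water": "compound",
    "air": "mixture",
    "iron": "element",
    "table salt": "compound",
}


def score_classification(responses, key=ANSWER_KEY):
    """Return the number of substances classified correctly.

    Omitted or unrecognized answers simply earn no credit.
    """
    return sum(
        1
        for substance, correct in key.items()
        if responses.get(substance, "").strip().lower() == correct
    )


# One pupil's paper: "water" is wrong and "table salt" is omitted.
pupil = {
    "oxygen": "element",
    "water": "mixture",
    "air": "Mixture",
    "iron": "element",
}
print(score_classification(pupil))  # prints 3
```

An instructor preparing a similar test would need only to replace the key with his own list of items; whether any correction for guessing should be applied is a separate question which this sketch does not attempt to settle.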

In addition to the tests in physics at Columbia Uni- 
versity referred to above, mention may be made of a 
standardized test in this subject intended for college 
as well as high-school use. This is the Columbia Re-
search Bureau Physics Test by Farwell and Wood.²
It consists of 144 true-false statements. Some of these 
deal with particular problems given in the body of 
the test and others with more general information in 
the field of physics. 

¹ This test is published by the Bureau of Public Personnel Administration,
Mills Building, Washington, D. C.

² All of the Columbia Research Bureau tests may be secured from the World
Book Company, Yonkers, New York.

Another general field in which not only college tests
similar to those for high school may be used, but in
which there are also tests designed particularly for
college, is that of the foreign languages. The experi-
mental work of Crawford and Raynaldo (15) already
referred to in Chapter VII was carried on in part with
true-false and traditional examinations in French and
Spanish grammar. The Columbia Research Bureau
tests also include several in foreign language, there
being one in French, another in German, and a third
in Spanish. Each of these consists of three parts. The
first, which deals with vocabulary, is in multiple-
answer form; the second, dealing with comprehension,
consists of true-false statements in the foreign lan-
guage; and the third, which tests knowledge of gram-
mar, is in completion form.

In English also there is a standardized test for col-
lege use prepared by the Columbia Research Bureau.
Part I of this test, in multiple-answer form, deals with
spelling; Part II presents a passage containing a
number of errors in grammar, syntax, punctuation,
capitalization, use of idioms, and construction, to be
corrected; Part III is a vocabulary test in multiple-
answer form; Part IV, likewise in the same form, deals
with literary knowledge.

The Columbia Research Bureau, in addition to the
several tests already mentioned, has made available 
tests in algebra and geometry. The former consists of 
two parts, the first of which deals with the mechanics 
of algebra, whereas the second contains written prob-
lems calling for its application. The geometry test like-
wise contains two parts. The first is composed of true
and false statements dealing with the subject-matter 
of geometry and the second of written problems 
therein. 

There has already been given in Chapter VIII a 
complete examination of the completion essay type in
philosophy. New-type tests, however, are also being
used in this subject and the allied one of logic. Refer-
ence has already been made to the use of objective
tests in philosophy at Columbia University. May (52
and 53) has reported on the use of standardized ex-
aminations in logic at Syracuse University. He gives
a number of illustrative exercises and presents some 
critical material. The exercises given include the mul- 
tiple-answer, true-false, rearrangement, and single- 
answer types. 

Of the subjects not strictly professional, and per- 
haps even of all subjects, it appears that psychology 
has more often employed new-type examinations than 
any other. Inasmuch as much of the work in psychol- 
ogy, particularly in educational psychology, is rather 
closely connected with the profession of teaching, it is
doubtful whether it should be included in this section 
among liberal arts subjects or in the next which deals 
with professional courses. It will, however, be consid- 
ered here. In the same experiment referred to just a 
page or two previously, Crawford and Raynaldo at 
the University of Idaho (15) employed true-false tests 
in psychology as well as in modern foreign language. 
Likewise Gates (28) in his experimental work at 
Teachers College, Columbia University, which has been 
referred to in Chapter VII, used new-type tests in both
elementary and advanced courses in psychology. Most, 
but not all, of his tests were of the true-false type. As 
was indicated in Chapter VII, he found they possessed 
quite a number of advantages over essay examinations. 
Laird (44 and 45) has written concerning the use of 
single-answer tests at the University of Wyoming. He 
likewise compared them with the traditional type and 
reached conclusions decidedly in their favor. It ap-
pears that at Colorado College, as reported by Rem-
mers and others (69), the use of objective tests had
reached such a point that experiments were carried 
on to determine which type of the new examination 
was the best. The true-false, multiple-answer, and 
completion were all employed. May (52 and 53), in his 
articles recently referred to, deals with objective tests 
in psychology as well as in logic. Four types were 
used, multiple-answer, true-false, analogies, and single- 
answer, and a number of definite advantages as com- 
pared with traditional examinations reported. Miller 
(57), at the University of Minnesota, prepared a test 
with more than two hundred true-false, completion, 
and multiple-answer items. He found that by taking 
certain precautions he was able to use the same test 
for ten quarters without noticeable increase in scores. 
In other words, it appeared that even though the same 
test was employed quarter after quarter, students were 
not able to prepare for it in such a way as to increase 
their scores unduly. 

Other examples of the use of objective tests in lib- 
eral arts subjects might be cited, but those already 
given appear to be sufficient to demonstrate that such 
tests may be employed in at least certain phases of 
all subjects and that they have already been widely 
enough used that they cannot be said to be in an ex-
perimental stage any longer. It is true that much more
experimentation and critical study are needed, but the 
information desired has to do with making the very 
best use of objective tests rather than with their merit 
when compared with traditional examinations. As can- 
not be said too often, each has its place, its merit, and 
its limitations, and the two together are much better 
than either one alone. 



3. Objective tests in professional courses. Not only
are objective tests suitable for use and actually being 
employed in liberal arts or general courses in our in- 
stitutions of higher learning but also they are finding 
a place in many professional schools. Indeed, the long- 
est test, in so far as the number of items is concerned, 
of which the writer has ever heard was reported as 
being used in a professional school. E. P. Wood (99)
makes the statement that in several professional
schools new-type examinations of from 250 to 500 ques-
tions were employed in some of the basic courses, and
that in one case she knew of a four-hour examination 
in a professional school which included 794 different 
items. Also the most complete series of new-type ex- 
aminations in any one subject or group of closely re- 
lated subjects with which the writer is familiar was 
developed for use in a professional school. This series 
will be described more completely a little later. 

As would undoubtedly be expected, the most exten- 
sive use of objective tests in any type of professional 
work is in connection with courses in education. This 
will be illustrated by referring to a number of such 
tests, some of which are more or less satisfactorily 
standardized whereas others are not. In addition to 
those mentioned here it is probable that several re- 
ferred to above in the discussion of tests in psychology 
could with equal propriety be included under educa- 
tion, since in many cases psychology, at least educa- 
tional psychology, may be regarded as a professional 
subject. Hannig (34) reports briefly a study made for
the Board of Examiners of New York City. This in- 
cluded the use of a four-hour traditional examination, 
a one-hour true-false test and a one-hour completion 
test as parts of the general examination for applicants
for teaching positions in Grades IA to VIB. He con-
cludes that a one-hour test of the new type is about
as reliable as a four-hour traditional examination and 
recommends that some of each type be employed. 

Knight, aided by Franzen, prepared a professional 
test for elementary school teachers consisting of thirty- 
six subtests, most of which require knowledge ordi- 
narily gained in education courses. True-false, single- 
answer, multiple-answer, matching, and other types 
of exercises are employed therein. Later, a series of 
“Professional Tests for Elementary Teachers”³ was
prepared by Knight, Ruch, Telford, and Bathurst
(100). Though partially aptitude tests these also deal
to a large extent with knowledge actually acquired in 
courses in education. One of them consists of four 
parts dealing respectively with reading, arithmetic, 
spelling, and writing; the other has six parts: per- 
sonal judgment, theory and practice of teaching, read- 
ing comprehension, social information, school and class 
management, and professional information. 

A rather large number of new-type exercises in edu- 
cational measurement may be found in an article by 
Mead (54). They include examples of true-false, single-
answer, completion, and other types of exercises. Their
value and use are discussed. It appears that Mead 
has made use of many, if not all, of those given in 
his own teaching at Ohio Wesleyan University. 

³ These tests may be secured from the Bureau of Public Personnel Administra-
tion, Mills Building, Washington, D. C.

A test which has been reserved for mention here,
though it might well have been included under psy-
chology, has been prepared by Geyer (30), apparently
chiefly for use at the Chicago Normal College. It con-
sists of 90 true-false statements and 60 multiple-
answer exercises, and is based upon the Twenty-First
Yearbook of the National Society for the Study of
Education, which dealt with intelligence tests.⁴

Still another test is that prepared by Weber⁵ (90).
It is called “A Standard Achievement Test on Aims, 
Purposes, Objectives, Attributes and Functions in Sec- 
ondary Education.” The first subtest is in completion 
form, the second is composed of multiple-answer ex- 
ercises, the third of multiple-reason exercises, and the 
fourth and fifth are again completion. The test has 
been used in a number of universities and other insti- 
tutions and may be said to be fairly well standardized. 
Another test which likewise deals with the field of 
secondary education has been prepared by the writer 
of this book.⁶ It is a “Standard Achievement Test on
Principles of Teaching in Secondary Schools” and 
consists of four subtests which are respectively of the 
matching, multiple-answer, incorrect-statements, and 
completion types. This has also been used widely 
enough so that fairly satisfactory norms are available. 
Both this and Weber’s test described just above are 
intended for rather specific testing of subject-matter 
taught in certain fundamental courses given prospec- 
tive secondary teachers. 

A test of an entirely different sort from any of those 
already mentioned, so far as its function is concerned, 
is the “Standard Achievement Test on An Introduc-
tion to Education” by Frasier and Armentrout.⁷ This
test covers knowledge of a single book, An Introduc-
tion to Education, by the two men just named. One
form includes true-false statements, matching tests,
and multiple-reason exercises, whereas the other form
is composed entirely of true-false statements.

⁴ Geyer's test can be secured from the Plymouth Press, Chicago, Illinois.

⁵ This is published by the Public School Publishing Company of Bloomington,
Illinois.

⁶ This test is also published by the Public School Publishing Company, Bloom-
ington, Illinois.

⁷ This test may be secured from Scott, Foresman and Company, Chicago.

Despite the fact that there appears to have been a 
much wider use of the new examination in courses 
in education than in other professional courses, yet 
instances of their use in others are not wanting. Thus 
Wood (94, pp. 271-273) gives examples of several va- 
rieties of such exercises used in civil engineering at 
Columbia University. Many schools of engineering 
have employed more or less standardized and objec- 
tive tests but in almost all cases they have been tests 
of aptitude rather than of actual engineering achieve- 
ment. 

At the beginning of this section reference was made 
to what the writer stated was the most complete series 
of tests of the new type with which he was familiar. 
These tests were worked out for use at the Summer 
Library Institute of the American Library Associa-
tion in 1926.⁸ This series includes tests grouped under
seven main heads as follows: 

Book selection
Reference work
Library classification
Lending methods
School library administration and organization
Children’s work
How to use the library

There are from three to ten different tests, including 
a total of from about seventy up to approximately 300 
exercises or items, under each head. Among the types
of exercises employed are the true-false, the multiple-
answer, the matching, the completion, and the single-
answer. The whole series, mimeographed and bound
together, covers hundreds of pages.

⁸ Copies of these tests are not generally available but information concerning
them can be secured by addressing Margaret B. Martin, General Assistant, Board
of Education for Librarianship of the American Library Association, 86 East
Randolph Street, Chicago.

Medicine is another professional field in which ob-
jective tests are being used. Trabue (88) has discussed
and described their use in this connection, giving a 
number of specimen exercises used at the College of 
Physicians and Surgeons of Columbia University.
True-false statements, yes-no questions, multiple- 
answer exercises, and completion statements are all il- 
lustrated. The report of his address at the annual meet- 
ing of the Association of American Medical Colleges 
in 1925 is followed by a discussion participated in by 
quite a number of those present. It appears from this 
discussion that at least a fair proportion of those who 
spoke were in favor of making some use of this type 
of examination. 

In law, also, the new examination is being employed. 
Its use in the Columbia Law School has been described 
and illustrated at some length by Wood (96). He pre-
sents both statistical evidence and opinions of in-
structors to indicate the value of objective tests in 
professional legal courses. Ferson and Stoddard have
prepared a “Law Aptitude Examination,”⁹ some
parts of which appear to test actual knowledge of legal 
principles and practice, although this is not true of 
most of the material which it contains. 

Although many, perhaps as yet even most, profes-
sional schools have not modified the types of examina-
tions which have been in use for many years, there ap-
pears to be no reasonable doubt that the use of the
new examination in such institutions is increasing. It
is not entirely replacing discussion examinations, as
indeed it should not, but is rather being used along
with them. The same advantages are resulting from its
use here as elsewhere, chiefly economy in time of giving
and scoring, and increase in reliability and in validity for
the measurement of certain outcomes of instruction.

⁹ This is published by the West Publishing Company of St. Paul, Minnesota.

4. Summary. Although the same general principles 
apply to the use of examinations in institutions of 
higher learning as in elementary and secondary 
schools, yet it seems well to give some specific atten- 
tion to the matter. In this chapter, therefore, men- 
tion is made of a number of both standardized and non- 
standardized objective tests which have been and are 
being employed in the various subjects found in col- 
lege and university curricula, both general and profes- 
sional. In practically all cases in which the same sub- 
jects are taught in college and in high school, tests of 
the same sort may be employed. Among the liberal 
arts subjects in which such tests have been employed 
are the various social sciences, the natural sciences, 
the foreign languages, English, mathematics, philos- 
ophy and logic, and psychology. Professional courses 
in which new-type tests have been employed are most 
frequently those in education, but they are also be- 
ing used in engineering, medicine, library work, and 
law. 




BIBLIOGRAPHY¹
(Selected and annotated) 

The bibliography given below consists of one hun- 
dred titles selected from some five or six hundred ac- 
cumulated by the writer in the preparation of this 
volume. Those given include all to which references 
have been made in the text and also enough others 
selected from among those of unusual helpfulness to 
bring the number up to an even hundred. A brief anno- 
tation follows each so that the reader may gain a 
better idea of its contents than is possible from the 
title alone. 

1. Achtenhagen, Olga, “Why is an Examination —
and What of It?” English Journal, 15:285-289,
April, 1926.

A plea that teachers employ examinations so as to derive 
the possible benefits therefrom instead of as mere necessities 
or formalities. 

2. Ballard, P. B., The New Examiner. London: Hod-
der and Stoughton, 1924. 269 pp.

Following several chapters devoted to examination as meas- 
urement are one on the essay, several on the new examination 
and finally several containing more or less standardized tests 
in a number of subjects. 

¹ It is probable that a more complete bibliography dealing with examinations
and marks will be published during the coming year by the Bureau of Educa-
tional Research of the University of Illinois.

3. Bardy, Joseph, “An Investigation of the Written
Examination as a Measure of Achievement with
Particular Reference to General Science.” Phila-
delphia: University of Pennsylvania, 1923. 176 pp.

This includes a study of present practices with respect to 
examinations in high school, the account of an experiment 
with traditional and new-type tests in general science with 
conclusions drawn therefrom, and a number of both kinds of 
tests. 

4. Blackhurst, J. H., “The Normal Curve as Related
to High School and College Grading,” School and
Society, 13:447-450, April 9, 1921.

An argument against the uncritical or general use of the 
normal curve to determine the distribution of grades. It is 
suggested that the upper half of the curve may be followed 
roughly. 

5. Bolton, F. E., “Do Teachers’ Marks Vary as Much
as is Supposed?” Education, 48:23-38, Septem-
ber, 1927.

In this article the writer presents data to show that under 
ordinary circumstances teachers’ marks are rather reliable. 
Also he analyzes some of the data which Starch cites to show 
great variability and claims that they do not support Starch’s 
conclusions. 

6. Branom, M. E., The Measurement of Achievement
in Geography. New York: The Macmillan Com-
pany, 1925. 186 pp.

This is a very helpful treatment of the subject, including 
discussions of the purpose, need and value of testing, many 
samples of tests with detailed information as to how to con- 
struct them, descriptions of standardized tests, a bibliography, 
and so forth. 



7. Brinkley, S. G., “Values of New-Type Examina-
tions in the High School with Special Reference
to History,” Teachers College, Columbia Univer-
sity Contributions to Education, No. 161. New
York: Teachers College, Columbia University,
1924. 121 pp.

In addition to the account of a carefully conducted experi-
ment dealing with the merits and limitations of five varieties
of new-type and also of essay examinations in history there 
is a general treatment of examinations. The experiment 
described is one of the most helpful ones reported. 

8. Bursch, J. F., and Meltzer, H., “The New Exami-
nation, Its Construction and Use,” School of Vo-
cational Education, Oregon State Agricultural
College Bulletin, No. 422. Corvallis: Oregon State
Agricultural College, 1926. 40 pp.

This discusses the characteristics of a good examination,
illustrates a dozen types of new examination questions, tells 
how to make and score a new examination, treats of its special 
uses and finally gives a number of tests in vocational sub- 
jects. On the whole it is one of the best treatments of the sub- 
ject. 

9. Buhler, W. F., “The Value of Informal Tests in
Supervision,” First Yearbook of the Department
of Elementary School Principals. Washington:
National Education Association of the United
States, 1922. Pp. 95-119.

Examples of tests constitute the bulk of his article. Quite 
a number of subjects are included and several of the examples 
illustrate rather unusual and suggestive types of tests. 

10. Caldwell, O. W., and Courtis, S. A., “Unanswer-
able Arguments from the Past,” Then and Now
in Education: 1845-1923. Yonkers: World Book
Company, 1924. Pp. 37-46.

This chapter consists almost entirely of quotations from 
Horace Mann praising the methods used in the Boston School 
Survey in 1845 and especially the use of uniform written ex-
aminations in place of oral ones.

11. Calkins, M. W., “Philosophers in Council,”
School and Society, 17:316-320, March 24, 1923.

An illustration of an unusual but ingenious, interesting and 
valuable type of discussion examination. It consists of the be-
ginnings of a number of speeches or arguments which are to be
completed. 

12. Camp, P. S., “Some ‘Marks’: An Administrative
Problem,” School Review, 25:697-713, Decem-
ber, 1917.

An account of the revision and standardization of a system’s 
marking plan, given in some detail. 

13. Chapman, J. C., “Individual Injustice and Guess-
ing in the True-False Examination,” Journal of
Applied Psychology, 6:342-348, December, 1922.

The writer shows by actual data the injustice done by using
the right-minus-wrong scoring formula and concludes that 
there is grave doubt whether the merits of the true-false test 
outweigh its demerits. 

14. Christensen, A. M., “A Suggestion as to Correct-
ing Guessing in Examinations,” Journal of Edu-
cational Research, 14:370-374, December, 1926.

The writer suggests that a true-false test be followed by a
multiple-answer one and that a pupil’s score be the number of 
exercises correctly answered on both tests. 

15. Crawford, C. C., and Raynaldo, D. A., “Some Ex-
perimental Comparisons of True-False Tests
and Traditional Examinations,” School Review,
33:698-706, November, 1925.

Fifteen out of twenty different comparisons indicate that 
the traditional examination is more reliable than the true-false 
tests, but in some situations the latter appears to be decidedly 
better. The use of both is recommended. 

16. Dadourian, H. M., “Are Examinations Worth the
Price?” School and Society, 21:442-443, April
11, 1925.

The writer reaches the conclusion that the results supposed 
to be obtained from examinations in college would be more 
effectively secured if examinations were abolished. 

17. Dearborn, W. F., “School and University
Grades,” Bulletin of the University of Wiscon-
sin, No. 368. High School Series, No. 9. Madison:
University of Wisconsin. June, 1910. 59 pp.

This is one of the earliest important studies of the subject. 
It deals with the distribution of ability, inequalities in marks, 
marks in school subjects and in the university, and so on. 

18. Douglass, H. R., “Quizzes, Examinations and
Marking,” and “New Ideas in Written Exami-
nations,” Modern Methods in High School
Teaching. Boston: Houghton Mifflin Company,
1926. Chapters XII and XIV.

Both of these chapters contain practical, clear helps for 
teachers and others interested. They are well illustrated with 
concrete examples and followed by short, but good, bibliog- 
raphies. 

19. Douglass, H. R., and Spencer, P. L., “Is it Neces-
sary to Weight Exercises in Standard Tests?”
Journal of Educational Psychology, 14:109-112,
February, 1923.

The writers present data which tend to show that the weight- 
ing of exercises is unnecessary, since the correlation between 
weighted and unweighted scores is very high. 
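
The comparison reported in this annotation can be illustrated with a small sketch. The pupils' records and the exercise weights below are invented for the illustration and are not taken from the study; the sketch merely computes the Pearson correlation between unweighted scores (number right) and weighted scores.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical records: one row per pupil, 1 = exercise right, 0 = wrong.
responses = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
]
# Hypothetical weights, heavier for the exercises assumed to be harder.
weights = [1, 2, 2, 3, 4]

unweighted = [sum(row) for row in responses]
weighted = [sum(w * x for w, x in zip(weights, row)) for row in responses]

r = pearson(unweighted, weighted)
print(f"correlation between weighted and unweighted scores: {r:.3f}")
```

A correlation near 1.00 would mean that the two scorings rank the pupils in nearly the same order, which is the ground on which weighting is judged unnecessary.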

20. Doyle, Lillian, and Foote, Marie, “The Pledge as an
Instrument to Secure Honesty in Examinations,” Peabody Journal
of Education, 3:79-84, September, 1925.

A study, based on data from one high school, leads the 
authors to conclude that the use of the pledge is undesirable. 

21. Du Breuil, A. J., “True-False Test in Literature

and Formal English,” Illinois Association of 
Teachers of English Bulletin, 15 : 1-17, May 1, 
1923. 

Following a brief discussion are several hundred true-false 
exercises covering various periods of English and American 
literature and a few commonly studied works. 

22. Farwell, H. W., “The New-Type Examinations in

Physics,” School and Society, 19:315-322, 
March 15, 1924. 

This discussion of new-type tests includes examples of sev- 
eral forms of tests and states a number of points learned 
through their use. 

23. Finkelstein, I. E., “The Marking System in The- 

ory and Practice,” Educational Psychology 
Monographs, No. 10. Baltimore: Warwick and 
York, 1913. 83 pp. 

This is a study of the distribution of marks at Cornell Uni- 
versity, showing general tendencies and variations among in- 
structors. It is recommended that a five-point system with 
approximately fixed per cents in each group be used. 





24. Foster, H. D., “Adequate Tests in History,” History
Teacher’s Magazine, 5:116-123, April, 1914.

The writer reports a study of many questions used in college 
entrance examinations, giving the ones selected as most and 
least adequate according to the judgments of both pupils and 
readers. 

25. Foster, R. R., and Ruch, G. M., “On Corrections 

for Chance in Multiple-Response Tests,” Jour- 
nal of Educational Psychology, 18:48-51, Janu- 
ary, 1927. 

Data based on testing about two thousand pupils show 
higher correlation with the criterion when wrongs and omissions
are also taken account of than when rights alone are used, but
indicate that $\text{Score} = R - \frac{W}{n-1}$ overpenalizes
slightly.
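
The correction formula named in this annotation may be sketched as follows, on the assumption that R denotes the number of right answers, W the number of wrong answers and n the number of alternatives offered in each exercise (the customary reading; the annotation itself does not define the symbols). The function name and the sample figures are hypothetical.

```python
def corrected_score(rights: int, wrongs: int, n_choices: int) -> float:
    """Correction for chance: Score = R - W / (n - 1).

    Omitted exercises are counted neither as rights nor as wrongs,
    so they do not enter the formula (an assumption of this sketch).
    """
    if n_choices < 2:
        raise ValueError("each exercise needs at least two alternatives")
    return rights - wrongs / (n_choices - 1)


# Hypothetical pupil: 50 five-choice exercises, 34 right, 12 wrong, 4 omitted.
five_choice = corrected_score(rights=34, wrongs=12, n_choices=5)
print(five_choice)  # 31.0

# With two alternatives (true-false) the formula reduces to rights minus wrongs.
true_false = corrected_score(rights=34, wrongs=12, n_choices=2)
print(true_false)  # 22.0
```

The penalty for each wrong answer shrinks as the number of alternatives grows, since a blind guess is then less likely to succeed.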

26. French, H. P., “A Practical Method of Translating Objective
Scores into Percentage Marks,” Journal of Educational Method,
6:60-61, October, 1926.

A short but clear explanation of an easy method of convert- 
ing scores into percentage marks. 

27. Fritz, M. F., “Guessing in a True-False Test,” 

Journal of Educational Psychology, 18: 558-561, 
November, 1927. 

The writer cites data which show the tendency to answer
true-false tests affirmatively oftener than negatively and
therefore concludes that if wrong answers are to affect the
scores such tests should contain more affirmative than negative
statements.

28. Gates, A. I., “The True-False Test as a Measure of
Achievement in College Courses,” Journal of Educational
Psychology, 12:276-287, May, 1921.

Data based on testing over 600 college students are given. 
True-false tests are shown to be more reliable than essay ex- 
aminations, also more valid ; to correlate more highly with in- 
telligence scores and finally are said to possess many other 
advantages. 

29. Gathany, J. M., “The Giving of History Examinations,”
Education, 34:514-521, April, 1914.

The writer offers a number of adverse criticisms of the 
usual methods of giving history examinations, especially of 
keeping the pupils from knowing what the questions are to be 
and advocates that the exact questions be given a week or 
more in advance, supporting this view by several arguments.

30. Geyer, D. L., “A Uniform Objective Examination

on Intelligence Testing,” Journal of Educational 
Psychology, 14:373-375, September, 1923. 

This is a brief description of a test composed of true-false 
and multiple-answer items. 

31. Gray, W. S., et al., “Informal Reading Tests,” Twenty-Fourth
Yearbook of the National Society for the Study of Education,
Part I. Bloomington, Illinois: Public School Publishing
Company, 1925. Pp. 233-264.

These pages contain a number of concrete and helpful sug- 
gestions as to how any teacher may construct tests in this field. 

32. Greene, C. E., “New-Type Tests,” Research Monograph, No. 3.
Denver: Public Schools, 1926. 35 pp.

This monograph contains a brief but helpful discussion of
the merits and demerits of essay and new-type examinations;
also of the construction and use of the latter; followed by
examples of six types thereof and of other types adapted to
various school subjects.

33. Hammond, E. L., “A Study of the Reliability of an Objective
Examination in Ninth-Grade English,” School Review, 35:45-51,
January, 1927.

A study of the reliability of three types of objective tests 
in English did not show that any one of them was definitely 
more reliable than the others, but did indicate that they were 
more reliable than essay examinations. 

34. Hannig, W. A., “The Relative Worth of Short-Answer and
Free-Answer Material in Elementary Teacher Tests,” Public
Personnel Studies, 4:277-278, October, 1926.

A study made for the Board of Examiners of New York 
City indicated that a one-hour short answer or new-type ex- 
amination gave results about as reliable as a four-hour free- 
answer or essay examination. It is not, however, recommended 
that the latter be entirely displaced by the former. 

35. Hayhurst, E. R., “How to Write an Examination.” Columbus,
Ohio: Ohio State University Cooperative Supply Company, 1922.
31 pp.

This is a brief and practical list of forty “pointers,”
intended for students and all others who may be taking
examinations. Little attention is given to theory, though
reasons are stated.

36. Hopkins, L. T., “The Construction and Use of 

Objective Examinations.” Boulder: College of 
Education, University of Colorado, 1926. 119 pp. 

Brief discussions of true-false, completion and multiple- 
choice tests are followed by long examples of these three types 
in about fifteen school subjects, most of which belong in the 
high school. 

37. Hulten, C. E., “The Personal Element in Teachers’ Marks,”
Journal of Educational Research, 12:49-55, June, 1925.

A study indicates that teachers are not consistent in giving 
high or low grades, therefore that no uniform correction can 
be applied to make the marks of different teachers comparable. 
Also it is shown that they are too unreliable to be used for 
determining promotion. 

38. Hunter, O. B., and Moss, F. A., “Standardized

Tests in Bacteriology,” Public Personnel 
Studies, 3 : 52-66, February, 1925. 

The tests are given in full and accompanied by a rather 
lengthy discussion. 

39. Johnson, F. W., “The Marking System,” The Administration
and Supervision of the High School. Boston: Ginn and Company,
1925. Chapter XV.

After showing the variability of marks Johnson discusses 
their meaning, distribution, the symbols to be used and other 
pertinent topics. 

40. Johnson, F. W., “A Study of High School 

Grades,” School Review, 19:13-24, January, 
1911. 

This gives data showing the variations in grades issued by 
different teachers and departments. It was the earliest study 
of this sort to attract general attention. 

41. Kelly, F. J., “Teachers’ Marks, Their Variability and
Standardization,” Teachers College, Columbia University
Contributions to Education, No. 66. New York: Teachers College,
Columbia University, 1914. 139 pp.

This very notable study deals with marking standards in
elementary and high schools and colleges, also the marking of
examination papers. The great variability and unreliability of
marks as given are clearly shown.

42. Kinder, J. S., “Supplementing our Examinations,” Education,
45:557-566, May, 1925.

This includes directions for making, giving and scoring 
true-false tests, data on the validity and reliability of several 
kinds of examinations and a list of the advantages of the new 
examination. 

43. Kolstoe, S. O., “Reactions to True-False Tests,” School of
Education Record of the University of North Dakota, 11:54-55,
April, 1926.

Responses from almost 300 students in a teachers college
showed that only one-ninth preferred the essay type, also that 
one-fourth felt that true-false scores were not satisfactory. 

44. Laird, D. A., “A Comparison of the Essay and the

Objective Type of Examinations,” Journal of 
Educational Psychology, 14:123-124, February, 
1923. 

A comparison of an essay and an objective examination over 
the same limited subject-matter showed that the latter se- 
cured more complete responses and that the correlation be- 
tween the two was almost zero. 

45. Laird, D. A., “A Note on the Shortening of Examinations,”
Journal of Educational Psychology, 15:116-117, February, 1924.

This reports a study indicating that a twenty-question ob- 
jective test is almost as reliable as an eighty-question one and 
that the correlation between students’ estimates of their 
marks and actual marks on an objective test is low. 

46. Lowell, A. L., “The Art of Examination,” Atlantic Monthly,
137:58-66, January, 1926. Also in The Work of the College
Entrance Examination Board, 1901-1925. Boston: Ginn and
Company, 1926. pp. 31-43.

This is a very forceful statement of three chief purposes 
or values of examinations: that they measure achievement, 
serve as direct means of education, and set standards. 

47. McAfee, L. O., “The Reliability of Non-Standardized Point
Tests,” Elementary School Journal, 24:579-585, April, 1924.

Data are given which tend to show that point or objective 
tests are more valid than discussion tests and single-answer 
tests slightly more so than yes-no ones. 

48. McCall, W. A., How to Measure in Education. 

New York: The Macmillan Company, 1922. Pp. 
119-133 and 193-318. 

The first reference contains a discussion of the use of in- 
formal examinations, centering it about the true-false test, 
with which it deals in some detail. The second one was written
with standardized tests in mind, but contains some suggestions
and principles applicable to informal testing.

49. McCall, W. A., “A New Kind of School Exami- 

nation,” Journal of Educational Research, 
1:33-46, January, 1920.

This article, which deals with true-false tests, was the first 
to attract general attention to the so-called “new examina- 
tion.” It describes the method of making, giving and scoring 
such tests. 

50. Marsh, W. E., et al., Report of the Commission

on New Types of Examinations, etc. New York: 
College Entrance Examination Board, 1923. 
39 pp. 

This includes the commission’s report citing some evidence
favorable to the new types, a review of this evidence which
points out that it has little weight, and several other minor
reports.

51. Masters, H. G., “Standards for Rating Pupils,” Journal of
Educational Method, 1:176-177, January, 1922.

The use of a rating scheme based upon descriptions and 
statements covering four phases of work and having five letter 
ratings is outlined. 

52. May, M. A., “Measuring Achievement in Elementary Psychology
and in Other College Subjects,” School and Society, 17:472-476
and 556-560, April 28 and May 19, 1923.

A supplement to May’s earlier article. This contains re- 
liability and validity coefficients and other critical material. 

53. May, M. A., “Standardized Examinations in Psychology and
Logic,” School and Society, 11:533-540, May 1, 1920.

This article illustrates and discusses the application of the 
technic of standardized tests to examinations in psychology 
and logic. 

54. Mead, A. R., “Suggestions for the Training of Teachers in
the Use of Educational Measurements,” Educational
Administration and Supervision, 12:23-43, January, 1926.

A large portion of this article is devoted to an examination 
which includes a number of varieties of new-type tests. 

55. Miller, G. F., “Formulas for Scoring Tests in Which the
Maximum Amount of Chance is Determined,” Journal of Educational
Psychology, 16:304-315, May, 1925.

Miller points out a fault in using the number of right answers
as the score if the maximum amount of chance is known and shows
how it may be easily corrected by using a formula which he
gives.

56. Miller, G. F., “Objective Tests in High School 

Subjects.” Norman, Oklahoma: G. F. Miller, 
1926. 168 pp. 

This contains the largest number of actual objective tests 
found in any publication with which the writer is acquainted. 
Following a brief general discussion are twelve to eighteen
pages of objective exercises in each of nine high-school
subjects. Various types are represented under each subject.

57. Miller, W. S., “An Objective Test in Educational 

Psychology,” Journal of Educational Psychol- 
ogy, 16: 237-246, April, 1925. 

A psychology test of 210 true-false, multiple-choice and 
completion items was used for ten quarters. The results cor- 
related .55 with average honor points, .37 with intelligence 
test scores, and so forth. 

58. Monroe, W. S., “Measurement of Achievement, 

General Principles,” and “The Improvement of 
Measurement Procedures,” Directing Learning 
in the High School. Garden City: Doubleday, 
Page & Company, 1927. Chapters XV and XVI. 

A discussion of varieties of measurement and the general 
principles thereof is followed by statements of the merits 
and limitations of different types of tests and suggestions for 
constructing and administering examinations and for marking. 

59. Monroe, W. S., “Written Examinations and their 

Improvement,” University of Illinois Bulletin, 
Vol. 20, No. 7, Bureau of Educational Research 
Bulletin, No. 9. Urbana: University of Illinois, 
1922. 71 pp. 

This contains a brief but very helpful discussion of the
arguments for and against ordinary written examinations,
some suggestions for their improvement and a number of
examples of both new-type and traditional examinations.

60. Monroe, W. S., and Carter, R. E., “The Use of Different
Types of Thought Questions in Secondary Schools and Their
Relative Difficulty for Students,” University of Illinois
Bulletin, Vol. 20, No. 34, Bureau of Educational Research
Bulletin, No. 14. Urbana: University of Illinois, 1923. 26 pp.

Twenty different types of thought questions are defined and 
their use described. 

61. Monroe, W. S., and Souders, L. B., “The Present

Status of Written Examinations and Sugges- 
tions for their Improvement,” University of 
Illinois Bulletin, Vol. 21, No. 13, Bureau of Edu- 
cational Research Bulletin, No. 17. Urbana: 
University of Illinois, 1923. 77 pp. 

In this will be found a summary of criticisms of examina- 
tions, a comparison of ordinary examinations and standard- 
ized tests, based upon actual data, a report of a study of ex- 
amination practices in Illinois high schools, suggestions for 
improving examinations and samples of the more objective 
varieties. 

62. Morley, E. E., “Final Examinations and the

Effect of Exemptions,” High School Teacher, 
2 : 90-91, March, 1926. 

The account of an experiment with exemptions which
convinced Morley of their good results in motivating work.

63. Odell, C. W., “High School Marking Systems,”

School Review, 33 : 346-354, May, 1925. 

A summary of the marking systems in about 300 Illinois 
high schools, showing that about 100 different plans are in use. 




64. Odell, C. W., “Objective Measurement of Information,”
University of Illinois Bulletin, Vol. 23, No. 36, Bureau of
Educational Research Circular, No. 44. Urbana: University of
Illinois, 1926. 27 pp.

Thirty-seven varieties of objective or near-objective tests 
are illustrated and discussed briefly. There is also a short gen- 
eral treatment of such tests. 

65. Opdyke, J. B., “Constructive Examinations,” 

Educational Review, 73:33-43, January, 1927.

In this helpful article the writer discusses short tests, setting 
the tests, evaluating questions, good and bad questions, se- 
quence and continuity in examinations, rating answers and 
the relation of examinations to the school organization. It is 
one of the best general treatments of the subject. 

66. Paterson, D. G., Preparation and Use of New-Type
Examinations. Yonkers: World Book Company, 1925. 87 pp.

This is one of the best published discussions of this subject.
The principles underlying adequate examinations are presented,
eight kinds of new-type questions are discussed and
illustrated, rather full directions for constructing,
administering and scoring them are given, and finally a fairly
long annotated bibliography is included.

67. Paterson, D. G., and Langlie, T. A., “Empirical Data on the
Scoring of True-False Tests,” Journal of Applied Psychology,
9:339-348, December, 1925.

On the basis of experimental data the writers conclude that
the right-minus-wrong method of scoring lowers reliability and
probably does not increase validity; furthermore that the
true-false test often does not measure with high validity.





68. Reeder, J. C., “The Geneseo Scale of Qualities,” Elementary
School Journal, 20:292-296, December, 1919.

A scale defining the quality of work denoted by each of the 
five letter marks used at Geneseo is described.

69. Remmers, H. H., Marschat, L. E., Brown, Adelaide, and
Chapman, Isabella, “An Experimental Study of the Relative
Difficulty of True-False, Multiple-Choice and
Incomplete-Sentence Types of Examination Questions,” Journal of
Educational Psychology, 14:367-372, September, 1923.

Experimentation with a limited number of subjects indi- 
cates that true-false and completion tests are more difficult 
than multiple-choice ones. 

70. Remmers, H. H., and Remmers, E. M., “The Negative
Suggestion Effect of True-False Examination Questions,” Journal
of Educational Psychology, 17:52-56, January, 1926.

An experiment with 136 paired pupils indicated that nega- 
tive carry-over effects from true-false exercises need not be 
feared. 

71. Ruch, G. M., The Improvement of the Written Examination.
Chicago: Scott, Foresman & Company, 1924. 193 pp.

This deals with several of the most often used types of ob- 
jective tests, discussing their construction, use, relative merits, 
reliability, validity, etc. Some attention is also given to writ- 
ten examinations in general, the unreliability of marks, and 
related topics. On the whole this is one of the best treatments 
of the subject so far published. 

72. Ruch, G. M., and DeGraff, M. H., “Corrections for Chance
and ‘Guess’ vs. ‘Do not Guess’ Instructions in
Multiple-Response Tests,” Journal of Educational Psychology,
17:368-375, September, 1926.

A short but excellent treatment of the subject. Eleven con- 
clusions, based upon experimental evidence, are given. Cor- 
rections for chance are shown to have both desirable and un- 
desirable effects. 

73. Ruch, G. M., and Stoddard, G. D., “Informal Objective
Examination Methods,” Tests and Measurements in High School
Instruction. Yonkers: World Book Company, 1927. Part III.

The authors discuss the place of such methods, the un- 
reliability of traditional examinations and the validity and re- 
liability of several types of tests. Examples, advantages and 
disadvantages of the common types of objective tests are 
given. 

74. Ruch, G. M., et al., Objective Examination Methods

in the Social Studies. Chicago : Scott, Foresman 
& Company, 1926. 116 pp. 

This contains several studies dealing with the reliability of 
traditional and new-type tests, the relative merits of several 
varieties of the latter, and other related topics. 

75. Rugg, H. O., “Teachers’ Marks and Marking Systems,”
Educational Administration and Supervision, 1:117-142,
February, 1915.

A rather comprehensive summary of the studies and 
writings of others, with additional material presented by 
Rugg. 

76. Rugg, H. O., “The Teachers’ Use of Statistical
Distributions in Giving School Marks,” A Primer of Graphics and
Statistics for Teachers. Boston: Houghton Mifflin Company,
1925. Chapter VI.

The writer shows why the marking system commonly used 
needs to be reconstructed, discusses what marks really meas- 
ure, and suggests a program for a new system. 

77. Russell, Charles, Classroom Tests. Boston: Ginn

and Company, 1926. 346 pp. 

This book contains full discussions of the construction and 
use of several of the new types of tests. Much attention is 
given to statistical methods of determining difficulty and of 
making composite scores. 

78. Spence, R. B., “The Improvement of College

Marking Systems,” Teachers College, Columbia 
University Contributions to Education, No. 252. 
New York: Teachers College, Columbia Univer- 
sity, 1927. 89 pp. 

This presents a somewhat original scheme for giving college
marks which could be used elsewhere as well. It is based on
McCall’s T-score method.

79. Starch, Daniel, “Can the Variability of Marks be Reduced?”
School and Society, 2:242-243, August 14, 1915.

Several possible means of reducing the variability of teach- 
ers’ marks are suggested. 

80. Starch, Daniel, “Marks as Measures of School 

Work,” and “A Sample Survey of the Marking 
System in a High School,” Educational Measurements. New York:
The Macmillan Company, 1916. Chapters II and III.

In these chapters are data showing the unreliability of
marks, with some constructive criticism concerning the mark- 
ing system. 




81. Starch, Daniel, and Elliott, E. C., “Reliability of Grading
Work in History,” School Review, 21:676-681, December, 1913.

82. Starch, Daniel, and Elliott, E. C., “Reliability of Grading
Work in Mathematics,” School Review, 21:254-259, April, 1913.

83. Starch, Daniel, and Elliott, E. C., “Reliability of

the Grading of High School Work in English,” 
School Review, 20 : 442-457, September, 1912. 

These three articles, which compose a unified series, prob- 
ably did more than any other single stimulus to arouse the 
interest in marking systems and the belief in the need for 
their greater reliability which has manifested itself in the last 
two decades or less. 

84. Stormzand, M. J., “American History Teaching

and Testing.” New York: The Macmillan Com- 
pany, 1926. 181 pp. 

Following a short discussion of supervised study and related 
topics is a treatment of the new examination, more than 100
pages of tests on Beard and Bagley’s The History of the 
American People and about twenty pages of topics for recita- 
tions on the same book. This material should be very helpful 
to teachers of history. 

85. Symonds, P. M., Measurement in Secondary Edu- 

cation. New York: The Macmillan Company, 
1927. Chapters I, II, III, XXIV and XXV.

These chapters deal with the purposes of measurement, the 
unreliability of examination marks, the improvement of ex- 
aminations, the construction of new-type tests, marking sys- 
tems and the use of tests for instructional purposes. 

86. Thoma, W. M., “Committee Marking of Examina- 

tions,” Bulletin of High Points, 6: 26-28, Febru- 
ary, 1924. 





The writer believes that the committee marking plan does 
not produce marks enough better than those of a single in- 
dividual to balance the disadvantages and bad results. 

87. Toops, H. A., “Trade Tests in Education,” Teachers College,
Columbia University Contributions to Education, No. 115. New
York: Teachers College, Columbia University, 1921. pp. 39-62.

A comparison of recall, multiple-answer and true-false tests 
given to 124 individuals indicated that their validity and 
reliability both decreased in the order named if equal num- 
bers of exercises were used, but that when equal amounts 
of time were employed the reliability was almost the same. 
There are also a general discussion of new-type tests and a 
number of examples. 

88. Trabue, M. R., et al., “Increasing the Usefulness of
Examinations,” Proceedings of the Thirty-Sixth Annual Meeting,
Association of American Medical Colleges, 1925. Pp. 31-53.

Trabue’s presentation of new-type tests, illustrated with 
exercises in medical subjects, is followed by general discussion 
by a number of those present. 

89. Webb, H. H., “A Preliminary Test in Chemistry,” 

Journal of Educational Psychology, 10:36-43,
January, 1919. 

An account of a semi-standardized test in chemistry which 
required the classification of a mixed list of elements, mix- 
tures, and compounds. 

90. Weber, J. J., “Achievement Test for Secondary 

Teacher Training,” High School Teacher, 3:84-85,
March, 1927.

This article contains a description of the test, a few critical
data concerning it, and some general discussion. 

91. Weidemann, C. C., “How to Construct the True-False
Examination,” Teachers College, Columbia University
Contributions to Education, No. 225. New York: Teachers
College, Columbia University, 1926. 118 pp.

This is the most complete treatment of this one type of test 
in print. Present practice and desirable modifications in both 
form and content are considered and many helpful sugges- 
tions given. 

92. Weld, L. D., “A Standard of Interpretation of 

Numerical Grades,” School Review, 25:412- 
421, June, 1917. 

This suggests a plan of changing the grades given by all 
teachers to the same basis. 

93. Whitten, C. W., “Report on Standardizing Teachers’ Marks,”
Sixth Yearbook of the National Association of Secondary School
Principals. Menasha, Wisconsin: George Banta Publishing
Company, 1922. Pp. 183-202.

Following a summary of a questionnaire study of practices 
are recommendations that an A, B, C, D, E system be used
and all reference to per cents abandoned, a detailed statement 
of what shall be required for each mark and a selected an- 
notated bibliography. 

94. Wood, B. D., Measurement in Higher Education. 

Yonkers: World Book Company, 1923. Chapters 
VIII to XIII.

These six chapters contain a rather full account of the use 
of the “new examination” in many courses at Columbia 
University. Lengthy examples are included and many data on 
its reliability and validity are given. 

95. Wood, B. D., “The Measurement of College 

Work,” Educational Administration and Super- 
vision, 7 : 301-334, September, 1921. 





This is the report of the use of the new examination in the
Contemporary Civilization course at Columbia University,
accompanied by general discussions of objectivity in
measurement and school marks.

96. Wood, B. D., “The Measurement of Law School 

Work,” Columbia Law Review, 24:224-265,
March, 1924. 

Wood deals with the meaning, basis, distribution and cor- 
relation of marks given in the Columbia Law School, illus- 
trates and describes the use of new-type tests there, and pre- 
sents evidence and opinions in their favor. 

97. Wood, B. D., “The Regents’ Experiment with New-Type
Examinations in French, Spanish, German and Physics,” New York
Experiments with New-Type Modern Language Tests. New York: The
Macmillan Company, 1927. Part II, pp. 105-319.

This gives a detailed account of the use of new-type tests 
as part of the New York Regents’ Examinations and compares 
the results therefrom with those from Regents’ Examinations 
of the old type. The new-type ones are shown to be decidedly 
superior. 

98. Wood, B. D., “Studies of Achievement Tests: I. 

The R versus the R — W Method of Scoring ‘Do 
not Guess’ True-False Examinations. II. The 
‘Internal Constitution’ versus the ‘External 
Form’ of Examination Questions,” Journal of 
Educational Psychology, 17 : 1-22 and 125-139, 
January and February, 1926. 

Wood presents data based on several tests in various subjects
which indicate that the R − W method of scoring is more valid
but less reliable than merely using the number right.
Also he discusses the form of tests and recommends instruc- 
tions against guessing. 




99. Wood, E. P., “Improving the Validity of Collegiate 

Achievement Tests,” Journal of Educational 
Psychology, 18:18-25, January, 1927. 

This reports a comparison of true-false, multiple-response,
completion and essay tests which indicates that the completion
test has slightly higher validity than the true-false and
multiple-answer and all three much greater validity than the
essay type.

100. “Standardized Tests for the Elemen- 

tary Teacher,” Public Personnel Studies, 
4:279-298, October, 1926. 

This deals with the tests constructed by Knight, Ruch,
Bathurst and Telford. Form A of both the Aptitude and the 
Subject Placement tests is given in full; also considerable 
statistical and critical material. 



INDEX 


A 

Abolition of examinations, 8, 18. 

Accumulation of exercises, 61, 244. 

Achievement quotient, 124. 

Achtenhagen, Olga, 210, 439. 

Adjusting teachers’ marks, 164,
172. 

Administration of examinations, 
64. 

Advantages of discussion examina- 
tions, 175. 

Advantages of examinations pre- 
pared by teacher, 27. 

Advantages of new examinations, 
183. 

Advantages of standardized tests, 
19. 

Agreement with objectives, 12. 

Agriculture exercises and tests, 
208, 290, 305, 368. 

Algebra tests, 311, 373, 383, 420, 
427. 

Alternative tests, 334. Also see 
true-false tests, yes-no tests.

Alternative tests with three answers, 354.

American history exercises and 
tests, 207, 208, 209, 221, 234, 
303, 317, 319, 413, 418. 

American Library Association, 433. 

Analogies tests, 416. 

Ancient history tests, 224, 266, 
314, 353, 370. 

Announced tests, 55. 

Announcing scores, 73. 

Anonymous scoring, 94. 


Answering questions, 60. 
Arithmetic tests, 232, 263, 307, 
315, 379, 301. 

Armentrout, W. D., 432. 
Arrangement of test items, 240, 
377. 

Art tests, 374. 

Association tests, 203, 295, 313. 

B 

Bacteriology test, 425. 

Ballard, P. B., 4, 189, 330, 439. 
Bardy, Joseph, 194, 440. 

Basis of marking, 113, 159. 
Bathurst, J. E., 431.

Beechview — Beechwood standards 
for rating pupils, 110. 
Best-answer tests. See multiple- 
answer tests. 

Bibliography, 430. 

Biology tests, 319, 379, 404. Also 
see botany tests, zoology tests. 
Blackhurst, J. H., 158, 440.
Bluffing, 15, 53, 202. 

Bolton, F. E., 7, 440. 

Bookkeeping tests, 259. 

Botany exercises and tests, 208, 
205, 382, 393.
Branom, M. E., 440. 

Brinkley, S. G., 32, 194, 196, 441.
Brown, Adelaide, 455. 

Bursch, J. F., 441. 

Butler, W. F., 441. 


C

Caldwell, O. W., 441.
Calkins, M. W., 212, 442. 




Camp, F. S., 132, 442. 

Carter, R. E., 205, 453. 

Catch questions, 47. 
Cause-and-effect tests, 303. 
Changing scores into marks, 97. 
Chapman, Isabella, 455. 

Chapman, J. C., 342, 442. 
Cheating, 15, 74, 181. 

Chemistry exercises and tests, 

208, 202, 350, 384, 420. 

Chicago Normal College, 431. 
Choice of questions, 48, 71. 
Christensen, A. M., 180, 346, 442. 
Civics exercises and tests, 208, 

209, 228, 235, 329, 355, 365. 
Classification tests, 304, 313. 
Coeflicient of alienation, 43. 
Coefficient of correlation, 24. 
Coefficient of reliability, 24, 43. 
Coefficient of validity, 40. 
Collecting exercises. See accumulation
of exercises.

Colleges. See institutions of higher 
learning. 

College Entrance Examination 
Board Commission on New 
Types of Examinations, 422. 
College entrance examinations, 
422. 

Colorado College, 429. 

Columbia Research Bureau tests, 
426, 427. 

Columbia University, 425, 428, 
433, 434. 

Commercial arithmetic tests, 268,
302. 

Commercial geography exercises 
and tests, 207, 264, 390, 399. 
Commission on New Types of Ex- 
aminations, 422. 

Committee system of marking, 92. 
Comparison of achievement with 
capacity, 123, 163. 

Complete enumeration tests, 272. 
Completion essay examination,
211.


Completion tests, 358. 

Completion tests with suggested 
answers, 366. 

Compound matching tests, 384. 

Compound multiple-answer tests, 
316. 

Compound single-answer tests, 
277. 

Confusing effect of tests, 181, 252,
282, 335, 389. 

Conditions, 138. 

Constant errors, 26. 

Construction of new examinations, 
237. 

Content of examinations, 54, 240. 

Continuity tests, 405. 

Cooking tests, 262, 306, 408, 415. 

Correcting scores for guessing, 
287, 330. 

Correlation, 24. 

Courtis, S. A., 441. 

Cramming, 13, 53. 

Crawford, C. C., 191, 427, 428, 442. 

Criterion measures, 46. 

Criticisms of examinations, 4, 9. 

D 

Dadourian, H. M., 32, 443. 

Daily marking, 129. 

Dearborn, W. F., 6, 443. 

Defense of examinations, 11. 

Definition tests, 207, 301.

DeGraff, M. H., 287, 336, 339, 340,
455. 

Description tests. See definition
tests, also multiple-description
tests.

Determination of standards, 36, 
40. 

Diagnosis, 36.

Dictionary test, 411.

Difficulty of tests, 48, 243, 251, 
256, 284, 339, 377. 

Directions to pupils, 70, 238, 285, 
297, 361. 





Discussion examinations, 22, 175, 
205. 

Distinguishing tests, 403. 

Distribution of marks, 141. 

Double marking systems, 120. 

Douglass, H. R., 84, 188, 443. 

Doyle, Lillian, 444. 

Du Breuil, A. J., 444. 

Duplicate tests, 203. 

E 

Ease and economy of scoring, 200, 
246, 250, 259, 291, 303, 307.

Ease of constructing tests, 170, 
201, 250. 

Economics exercises and tests,
208, 308, 404. 

Economy of time, 20, 47, 200, 371. 

Educational quotient, 124. 

Education tests, 430. 

Elliott, E. C., 6, 171, 458.

Engineering tests, 433. 

English exercises and tests, 220, 
273, 279, 427. Also see dictionary 
test, literature test, grammar 
test. 

Equivalent tests, 203. 

Errors, 26, 44. 

Errors of estimate, 44. 

Errors of measurement, 44. 

Essay examinations. See discussion examinations.

Examples for pupils, 238.

Exemption from examinations, 50. 

F 

Factors which determine marks. 
See basis of marks. 

Failing marks, 154, 156. 

False accuracy in marking. See 
unreliability of marks. 

Farwell, H. W., 420, 444.

Ferson, M. L., 434. 

Finkelstein, I. E., 444. 


Five-symbol marking systems, 137, 
152. 

Foote, Marie, 444. 

Foreign language tests. See
French tests, German tests,
Latin tests, Spanish tests.

Foster, H. D., 445. 

Foster, R. R., 288, 341, 445. 

Four-symbol marking systems, 155. 

Franzen, R. H., 431.

Frasier, G. W., 432. 

French, H. P., 106, 445. 

French tests, 322, 309, 419, 427. 

Frequency of tests, 55, 64. 

Fritz, M. F., 338, 445. 

Functions of examinations. See
purposes of examinations.

G 

Gates, A. I., 191, 194, 428, 445. 

Gathany, J. M., 09, 440. 

General science exercises and tests, 
208, 210, 235, 313, 349, 398, 
401. 

Geneseo Scale of Qualities, 118. 

Genus-species tests, 205. 

Geography exercises and tests,
208, 209, 230, 236, 207, 293, 320,
380, 397. Also see commercial
geography tests, physiography 
tests. 

Geometry tests, 301, 402, 427. 

George Peabody College for Teach- 
ers, 420. 

German tests, 278, 381, 427. 

Geyer, D. L., 431, 440.

Grammar exercises and tests, 
208, 225, 352, 371, 417. 

Gray, W. S., 440.

Greene, C. E., 446. 

Guessing, 57, 180, 250, 283, 330, 
343. 

H 

Hammond, E. L., 240, 447. 

Hannig, W. A., 430, 447. 





Hayhurst, B. E., 447. 

History exercises and tests, 210. 
Also see American history tests, 
ancient history tests, medieval 
and modern history tests.
History of examinations, 3. 

Home economics tests, 324, 348. 
Also see cooking tests, sewing
tests. 

Honor system. See cheating. 
Hopkins, L. T., 249, 447. 

Hulten, C. E., 167, 447. 

Hunter, O. B., 425, 448.

I 

Idaho, University of, 428. 
Identification tests, 395. 

Illinois marking systems, 133. 
Improvement of marking system. 
See marks. 

Incorrect statements tests, 387. 
Influence of incentives on marks, 
145. 

Institutions of higher learning, 
422.

Intangible products of education, 
17. 

Item, 240. 

J 

Johnson, F. W., 6, 39, 448. 
Judgment tests. See verification 
tests. 

K 

Kelly, F. J., 6, 90, 448. 

Kinder, J. S., 194, 248, 449. 

Knight, F. B., 431. 

Kolstoe, S. O., 194, 449.

L 

Laird, D. A., 428, 449. 

Langlie, T. A., 340, 454. 


Language habits, 16, 91, 199. 

Latin exercises and tests, 208, 311,
323, 355, 385, 403. 

Law test, 434. 

Learning, 37. 

Length of tests, 50, 64, 240, 377. 

Letter marks, 135, 137, 152. 

Liberal arts courses, 424. 

Library tests, 433. 

Limitations of new examinations, 
175. 

Limitations of standardized tests, 
27. 

Limitations of traditional exam- 
inations, 183. 

Limits of marks, 156, 157. 

Literature exercises and tests, 
207, 208, 209, 210, 233, 268, 277, 
291, 381. 

Logic test, 428. 

Lowell, A. L., 19, 449. 

M 

MacKinnon, Flora L., 212. 

Mann, Horace, 4. 

Manual training test, 258. 

Marking non-average groups, 149, 
158. 

Marking small groups, 151. 

Marks, 97, 109, 141. 

Marschat, L. E., 455. 

Marsh, W. K., 450. 

Masters, H. G., 116, 451. 

Matching tests, 376. 

Mathematics tests. See algebra 
tests, arithmetic tests, geometry 
tests, trigonometry tests. 

May, M. A., 104, 248, 341, 428, 
429, 452. 

McAfee, L. O., 195, 450.

McCall, W. A., 336, 450. 

Mead, A. R., 431, 452. 

Mean, 44.

Meaning of marks. See basis of 
marking. 





Measurement and improvement of 
teaching efficiency, 36. 

Measurement of pupil ability and 
achievement, 35. 

Median, 25. 

Medicine tests, 434. 

Medieval and modern history tests, 
309, 326, 384, 410. 

Meltzer, H., 441. 

Miller, G. F., 451, 452. 

Miller, W. S., 429, 452. 

Minnesota, University of, 429. 

Misuse of normal distribution of 
marks, 151. 

Monroe, W. S., 22, 43, 205, 452, 
453. 

Morley, E. E., 50, 453. 

Moss, F. A., 425, 448. 

Motivation, 38, 47, 112, 125. 

Multiple-answer tests, 281. 

Multiple-choice tests. See multiple- 
answer tests. 

Multiple-description tests, 329. 

Multiple-example tests. See plural- 
example tests. 

Multiple-reason tests, 322.

Music tests, 261, 274. 

N 

New examinations, 22, 175, 237. 

New-type tests. See new examina- 
tions. 

New York City Board of Exam- 
iners, 430. 

New York Regents’ Examinations, 
190, 422. 

Normal distribution of marks,
141. 

Norms, 19, 48, 201. 

O

Objective. See objectivity. 

Objective tests. See new examina- 
tions. 


Objectivity, 18, 40, 250. 

Odell, C. W., 43, 133, 432, 453,
454.

Ohio Wesleyan University, 431. 
Omnibus test, 242. 

Opdyke, J. R., 454. 

Oral testing, 245, 251, 257, 283. 
Ordinary matching tests, 378. 
Ordinary multiple-answer tests, 
289. 

Ordinary single-answer tests, 258. 
Overemphasis on examinations, 14. 


P 

Paragraph completion tests, 365, 
367, 369, 371. 

Partial enumeration tests, 272. 

Paterson, D. G., 340, 454. 

Penalizing wrong answers, 287, 
339. 

Percentile marks, 135. 

Philosophy tests, 212. 

Physical and mental results of examinations, 11, 16.

Physics exercises and tests, 231, 
235, 308, 372, 398, 420. 

Physiography exercises and tests, 
275, 414. 

Physiology exercises and tests, 
208, 236, 271, 330, 302, 401. 

Pledge, 74. 

Plural-choice tests. See plural
multiple-answer tests.

Plural-example tests, 272. 

Plural multiple-answer tests, 310. 

Preparation of examinations, 26, 
59, 337, 360. 

Prevention of cheating. See cheating.

Probable error of estimate, 44. 

Probable error of measurement, 
44. 

Professional courses, 430. 

Pronunciation tests, 279. 





Providing opportunities for learn- 
ing, 37. 

Psychology tests, 428. 

Pupils’ reactions to tests, 14, 85, 
193. 

Purposes of examinations, 32, 177, 
240, 256, 283, 335, 358, 306, 423.

Purposes of marks, 111.

Q 

Qualities of good examinations, 
40. 

Quotations in tests, 211, 233. 

Quotient scores, 124. 

R 

Raynaldo, D. A., 191, 427, 428,
442. 

Reading tests, 296, 351. 

Rearrangement tests. See continuity tests.

Recall tests. See single-answer 
tests. 

Recognition tests. See multiple- 
answer tests. 

Reeder, J. C., 120, 455. 

Regularity of tests, 55. 

Reliability, 5, 41, 249. Also see
unreliability of marks.

Remmers, E. M., 336, 455.

Remmers, H. H., 336, 420, 455. 

Repetition of questions, 62. 

Returning papers, 72, 95.

Revealing questions in advance,
68.

Right-minus-wrong formula, 287, 
339. 

Ruch, G. M., 43, 190, 192, 249,
287, 288, 336, 339, 340, 341, 
431, 445, 455, 456. 

Rugg, H. O., 130, 450.

Russell, Charles, 185, 457.


S

Sampling pupils’ knowledge, 184.
Scores, 29, 97. 


Scoring tests, 71, 81, 86, 246, 257, 
287, 305, 339, 361, 378, 390, 
390, 406, 413, 417. 

Selection of incorrect answers, 
243, 286. 

Self-correlation, 41. 

Setting of goals, 40. 

Seven-symbol marking systems, 
155. 

Sewing tests, 272, 303. 

Similarities tests, 295. 

Simple completion tests, 362. 

Single-answer tests, 255. 

Single-answer tests of one exercise each, 261.

Single-example tests, 270. 

Six-symbol marking systems, 137, 
155. 

Skew distribution, 141. 

Sorting method, 89. 

Souders, L. B., 22, 453.

Spanish tests, 264, 271, 427. 

Spelling tests, 296, 353, 305. 

Spence, R. B., 457.

Spencer, P. L., 84, 443. 

Stamford, Connecticut, High 
School marking system, 132. 

Standard deviation, 44. 

Standard distribution of marks. 
See distribution of marks.

Standard tests. See standardized 
tests. 

Standardized tests, 8, 10, 424.

Starch, Daniel, 6, 136, 170, 171, 
457, 458. 

Stoddard, G. D., 34, 434, 466. 

Stormzand, M. J., 458.

Subject quotient, 124. 

Symonds, P. M., 33, 458. 

Syracuse University, 428. 


T

Telford, Fred, 431.
Testing speed, 199. 
Thoma, W. M., 93, 458. 





Thought questions, 205.

Three-symbol marking systems, 
155. 

Time devoted to examinations, 16, 
26, 46. 

Timing of tests, 55, 65, 285, 360. 

Toops, H. A., 249, 459. 

Trabue, M. R., 434, 459. 

Traditional examinations. See dis- 
cussion examinations. 

Transmutation of scores. See 
changing scores into marks. 

Trigonometry tests, 363. 

True-false tests, 191, 334, 347. 

Types of mental activity, 205. 

U 

Unannounced tests, 55. 

Universities. See institutions of 
higher learning. 

University High School, Univer- 
sity of Illinois, 122. 

Unreliability of marks, 5, 167, 188. 

Use of books, etc., during examina- 
tions, 68. 

Use of examples, 67. 


V

Valid. See validity.

Validity, 18, 45, 59, 195, 248. 
Variability of marks. See unreliability of marks.


Variable errors, 26. 

Variety in examinations, 54, 63, 
244, 248. 

Verification tests, 412. 


W 

Webb, H. A., 426, 459. 

Weber, J. J., 432, 459.
Weidemann, C. C., 337, 459.
Weighting different kinds of work,
127. 

Weighting exercises and items, 81, 
188. 

Weld, L. D., 166, 460. 

Whitten, C. W., 120, 460. 

Wood, B. D., 111, 190, 195, 340,
422, 425, 426, 433, 460, 461. 
Wood, E. P., 248, 340, 430, 462. 
Writing tests on blackboard, 245, 
251, 284. 

Wyoming, University of, 428. 


Y

Yes-no tests, 350.


Z 

Zero scores, 347. 

Zoology exercises and tests, 231, 
295, 331. 








